Home Account

unique id with multiple tables using sphinx

2012-05-28 03:27 dennis iversen

Tags: mysql sphinx

I had some troubles creating unique ids using sphinx. I needed this because I wanted to search multiple tables where ids may collide. Sphinx needs a unique id for every document, so it is not an option to use the auto incremented ids from the tables as they most likely will collide across tables.

My solution was to use the MySQL function UUID_SHORT which generates a large random int as identifier (a 64-bit unsigned integer).

If you use this on the 0.9.9 version of sphinx (which is the default current version on my 11.04 Ubuntu), you will get an error like this:

WARNING: DOCID_MAX document_id, skipping

This is because my Linux distro sphinx is not build with support for 64-bit unsigned integer.

You can choose to build it yourself with

–enable-id64

Which did not work for me right away, so I found out that the sphinx website have some newer versions for download. I choose the latest sphinx for Ubuntu and it worked straight away. I use a configuration for my two sphinx sources that looks something like this:

source content_article 
{

    sql_query               = \
        SELECT \
            UUID_SHORT() AS id, \
            id AS real_id, \
            1 AS category, \
            user_id, \
            abstract, \
            article AS content, \
            UNIX_TIMESTAMP(updated) AS date_added, \
            title \
        FROM content_article

    sql_attr_uint           = user_id
    sql_attr_uint           = real_id
    sql_attr_uint           = category

}

source blog : content_article
{
    sql_query               = \
        SELECT \
            UUID_SHORT() AS id, \
            id AS real_id, \
            user_id, \
            2 AS category, \
            abstract, \
            entry AS content, \
            UNIX_TIMESTAMP(updated) AS date_added, \
            title \
        FROM blog

    sql_attr_uint           = user_id
    sql_attr_uint           = real_id
    sql_attr_uint           = category
}

Then I can search on both articles and blog entries or just blog entries or articles (filter with category id)

This page has been requested 8921 times