Professional Web Applications Themes

Help with Keyword search - MySQL

Hi all, I've got a tricky query I hope you guys can help me with. I have a database of a few millions records, representing subjects from messages, emails, articles and other sources. We need to perform fast queries on these to answer clients' requests. I have created two table related to the search in question: table_Subject: keeps the subject of the message with a reference number and an autoinc Id number, like this: ID: 1, SUBJECT: How to unsintal?, REF: 138 ID: 2, SUBJECT: How to get updates, REF: 4668 etc table_Words: keeps a numeric hash (crc32 for now) ...

  1. #1

    Default Help with Keyword search

    Hi all,

    I've got a tricky query I hope you guys can help me with.

    I have a database of a few millions records, representing subjects from
    messages, emails, articles and other sources. We need to perform fast
    queries on these to answer clients' requests.

    I have created two table related to the search in question:

    table_Subject: keeps the subject of the message with a reference number and an
    autoinc Id number, like this:

    ID: 1, SUBJECT: How to unsintal?, REF: 138
    ID: 2, SUBJECT: How to get updates, REF: 4668
    etc

    table_Words: keeps a numeric hash (crc32 for now) of all words in the subject
    of an email and the ID linking to the subject table's ID, like this:

    ID: 1, SUBJECT_ID: 1, WORD_HASH: 25739953, WORD: how
    ID: 2, SUBJECT_ID: 1, WORD_HASH: 395739, WORD: to
    ID: 3, SUBJECT_ID: 1, WORD_HASH: -1847343, WORD: uninstall

    ID: 4, SUBJECT_ID: 2, WORD_HASH: 25739953, WORD: how
    ID: 5, SUBJECT_ID: 2, WORD_HASH: 395739, WORD: to
    ID: 6, SUBJECT_ID: 2, WORD_HASH: -1847343, WORD: get
    ID: 7, SUBJECT_ID: 2, WORD_HASH: -2946387, WORD: updates

    etc. I'm using a hash function for speed, as a string search is WAY too slow,
    even if the word is the primary index. The database is also expected to grow
    considerably as we import more stuff in it. For 2 million emails, we had more
    than 20 millions word entries. Search for a single word takes a couple of
    seconds, which subsequence searches (hitting NEXT for next page of same query)
    takes less than 0.05 second.

    But we want to be able to search for two or more words like "how" and
    "uninstall" or "how" and not "updates", etc. My problem is that I can't seem to
    be able to come up with the right select query. For example if I want to search
    for "how" and "uninstall", I end up with records that contain either words, not
    both of them. And I simply cannot get the "not" to work.

    Someone mentioned views, but I'm not sure how that would slow things and wanted
    to know if you guys had any pointers.

    Thanks in advance!

    --Eric

    Eric Fortier Guest

  2. #2

    Default Re: Help with Keyword search

    It sounds like you're doing a full text search. You might look into
    using the MyISAM fulltext search index - that should provide you with
    fairly rapid results. You can do boolean full text searches as well,
    and obtain rankings based on relevancy.

    Another alternative is to use Lucene, which is designed for what you
    are doing - full text searches. Lucene is a java product, and I
    believe it's part of the Apache Foundation' set of products.

    --
    Chander Ganesan
    Open Technology Group, Inc
    Expert MySQL Training
    [url]http://www.otg-nc.com/training-courses/category.php?cat_id=1[/url]

    Chander Ganesan Guest

  3. #3

    Default Re: Help with keyword search

    Hi Chander!
    > It sounds like you're doing a full text search. You might look into
    > using the MyISAM fulltext search index - that should provide you with
    > fairly rapid results. You can do boolean full text searches as well,
    > and obtain rankings based on relevancy.
    Thanks a lot for suggesting the fulltext option. I thought I had tried this
    option but it turns out I did not. Bottom line is that it increased the speed
    so much that queries are almost instantaneous now, even on millions of records!
    And our server is an old P3 700 Mhz (with lots of memory and fast drives, but
    still!).

    > Another alternative is to use Lucene, which is designed for what you
    > are doing - full text searches. Lucene is a java product, and I
    > believe it's part of the Apache Foundation' set of products.
    I've briefly looked at it, and it seems really fast indeed. There's a bunch of
    port to C++, Delphi and other languages too.

    Thanks a lot for your help.

    --
    Eric Fortier
    Tech Logic, Inc.
    [url]http://www.tlnewsreader.com[/url]
    [url]http://www.techlogic.ca[/url]

    Eric Fortier Guest

Similar Threads

  1. Auto Linker with keyword search
    By workdasheddotcom@adobeforums.com in forum Adobe Acrobat Macintosh
    Replies: 1
    Last Post: September 8th, 08:34 PM
  2. Keyword search on 5 fields
    By ToniM in forum FileMaker
    Replies: 0
    Last Post: August 5th, 01:37 AM
  3. Site Keyword Search Required!!!!!
    By David Lozzi in forum ASP Components
    Replies: 8
    Last Post: July 28th, 10:50 AM
  4. Replies: 1
    Last Post: July 17th, 04:22 AM
  5. Can a keyword search launch movie/go to marker?
    By Yenny webforumsuser@macromedia.com in forum Macromedia Director Basics
    Replies: 0
    Last Post: July 10th, 05:00 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139