tweaking full-text retrieval - MySQL

  1. #1

    tweaking full-text retrieval

    Hi,

    I want to use MySQL's full-text retrieval, but would need to optimize
    it for my application.

    It seems possible to switch between two term weighting schemes (IDF and
    IDFP). Is there any way to have greater control over this?

    More importantly, I will need to adjust the document length
    normalization, which is completely inadequate for my purpose. Is this
    possible?

    I will also need to adjust the stop-word list, reduce the minimum
    length of indexed words, and remove the 50% frequency cutoff, all of
    which seem to be nicely documented. No problems there, at least.

    Thanks,
    Jens

    Jens Guest

  2. #2

    Re: tweaking full-text retrieval

    Hi,

    You could adjust the document length normalization factor by modifying
    PIVOT_VAL in myisam/ftdefs.h and re-compiling.

    The ft_stopword_file system variable controls the list of stopwords.
    ft_min_word_len controls the minimum word length. There's some more
    info at:

    <http://dev.mysql.com/doc/refman/5.0/en/fulltext-fine-tuning.html>
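
    A minimal sketch of how those two settings are usually applied,
    assuming a hypothetical MyISAM table named articles. Both variables
    are read at server startup, so the values go into the [mysqld] section
    of the option file, and existing FULLTEXT indexes have to be rebuilt
    afterwards. The value 2 and the file path are only illustrations.

    ```sql
    -- Option file, [mysqld] section (shown as comments; restart the server after editing):
    --   ft_min_word_len  = 2
    --   ft_stopword_file = /path/to/stopwords.txt   (hypothetical path)

    -- Check which full-text settings the running server picked up.
    SHOW VARIABLES LIKE 'ft%';

    -- Rebuild the FULLTEXT index so it reflects the new settings.
    -- articles is a hypothetical table name.
    REPAIR TABLE articles QUICK;
    ```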

    Boolean mode FTS doesn't use the 50% threshold. Examples here:

    <http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html>
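
    As a rough illustration of a boolean-mode query, assuming a
    hypothetical articles table with a FULLTEXT index on (title, body).
    The search terms are placeholders. Rows are returned even when a term
    appears in more than half of the table.

    ```sql
    -- articles, title, body and the search terms are assumptions for illustration.
    -- '+mysql' makes that word mandatory; 'retrieval' is optional but raises the score.
    SELECT id,
           title,
           MATCH (title, body) AGAINST ('+mysql retrieval' IN BOOLEAN MODE) AS score
    FROM articles
    WHERE MATCH (title, body) AGAINST ('+mysql retrieval' IN BOOLEAN MODE)
    ORDER BY score DESC;
    ```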

    If you need more control, you might try using the lucene search engine.

    petersprc@gmail.com Guest

  3. #3

    Re: tweaking full-text retrieval

    Hi,

    Ok, I am now inclined to go for boolean matching (which apparently
    counts the number of different matches) and normalize by length
    afterwards. Unfortunately, it doesn't seem to count the number of
    times a term was matched in the document.

    Anyway, how can I get the number of words in a string? I couldn't find
    anything like that in the string functions section of the manual.

    It's not really a natural language document retrieval application, I'm
    just abusing it for my purpose. And I'd really like to be able to work
    directly on our databases.

    Thanks for your help.

    Ciao,
    Jens

    Jens Guest

  4. #4

    Re: tweaking full-text retrieval

    There is some code in the user comments here which might be useful for
    counting substrings, or splitting a string into words:

    http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
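
    One common idiom for this, sketched under some assumptions: words are
    separated by single spaces, the table is a hypothetical articles table
    with a body column, and 'mysql' stands in for the real search term.
    It is not necessarily what the manual comments use. REPLACE() matches
    case-sensitively, hence the LOWER() call.

    ```sql
    -- articles, body and the term 'mysql' are assumptions for illustration.
    SELECT id,
           -- Word count = number of spaces + 1 (an empty string still yields 1).
           LENGTH(TRIM(body)) - LENGTH(REPLACE(TRIM(body), ' ', '')) + 1
               AS word_count,
           -- Occurrences of the term = bytes removed by REPLACE / bytes per term.
           (LENGTH(body) - LENGTH(REPLACE(LOWER(body), 'mysql', '')))
               DIV LENGTH('mysql') AS term_hits,
           -- Boolean-mode score normalized by the word count computed above.
           MATCH (body) AGAINST ('mysql' IN BOOLEAN MODE)
               / (LENGTH(TRIM(body)) - LENGTH(REPLACE(TRIM(body), ' ', '')) + 1)
               AS score_per_word
    FROM articles
    WHERE MATCH (body) AGAINST ('mysql' IN BOOLEAN MODE)
    ORDER BY score_per_word DESC;
    ```

    term_hits counts plain substring matches, so it also counts the term
    inside longer words.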

    Jens Grivolla wrote:
    >
    > Ok, I am now inclined to go for boolean matching (which apparently
    > counts the number of different matches) and normalize by length
    > afterwards. Unfortunately, it doesn't seem to count the number of
    > times a term was matched in the document.
    >
    > Anyway, how can I get the number of words in a string? I couldn't find
    > anything like that in the string functions section of the manual.
    >
    > It's not really a natural language document retrieval application, I'm
    > just abusing it for my purpose. And I'd really like to be able to work
    > directly on our databases.
    >
    > Thanks for your help.
    >
    > Ciao,
    > Jens

    petersprc Guest

  5. #5

    Re: tweaking full-text retrieval

    Thanks, it really looks like some of those examples do what I need.
    I did read the documentation (probably even on that same page), but
    completely missed the comments.

    Jens Guest
