  1. #1

    SpellCheck in perl

    Hi,



    We need to check the spelling of a word that is actually a domain
    name. For example, we have to check the word "onlinetradeing".
    When we run it through spell checkers, we get unrelated suggestions
    such as on, obliterating, incinerating, intruding, etc. What we
    actually want is "online trading". So we would like the word to be
    split into its component words and have their spelling checked too.
    Normal spell checkers only look words up in the dictionary; they do
    not split a word into a phrase.



    Regards
    L.Srikanth.

    Srikanth Guest

  3. #2

    Re: SpellCheck in perl

    Once upon a time, "Srikanth" <srikanthkumar.lingala@gmail.com> said:
    >
    >We need to check the spelling of a word that is actually a domain
    >name. For example, we have to check the word "onlinetradeing".
    >When we run it through spell checkers, we get unrelated suggestions
    >such as on, obliterating, incinerating, intruding, etc. What we
    >actually want is "online trading". So we would like the word to be
    >split into its component words and have their spelling checked too.
    >Normal spell checkers only look words up in the dictionary; they do
    >not split a word into a phrase.
    I think this is the wrong newsgroup for this kind of request. You
    should probably take this kind of request to comp.lang.perl.misc.

    That aside, it sounds like you need to write a little code that
    tries to match the first N characters of your domain names against
    words in your dictionary, and for each match try the same against
    the remainder of the domain name (after the matched word), and so
    on until some combination of matches matches the entire phrase.

    Perhaps something like this (warning, untested code):

    my %DICT = (); # initialize this with dictionary words, e.g.
                   # %DICT = map { $_ => 1 } qw(online trading);

    sub matcher
    {
        my ( $phrase, @components ) = @_;
        my $plen = length ( $phrase );
        for ( my $i = $plen; $i > 0; $i-- ) # try to match more, first
        {
            my $frag = substr ( $phrase, 0, $i );
            next unless ( exists($DICT{$frag}) );
            return ( "MATCH FOUND", @components, $frag ) if ( $i == $plen );
            push ( @components, $frag );
            # the remainder starts at offset $i, right after the matched word
            return ( matcher ( substr ( $phrase, $i ), @components ) );
        }
        return ( "NO MATCH POSSIBLE", @components, $phrase );
    }

    my ( $result, @word_list ) = matcher ( "onlinetrading" );
    # $result should now be "MATCH FOUND"
    # @word_list should now be ( "online", "trading" )

    A problem with this "greedy" approach is that a subphrase might
    match too much, rendering the remaining fragment unmatchable. For
    instance, matcher("maileditorial") would fail to parse the entire
    phrase if "mailed" were in the dictionary. The alternative would
    be to build up a list of intermediate results, one for each prefix
    that matched some word in the dictionary, and call matcher() on
    the remainder after each of those prefixes. This would explore all
    possible matches.
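
    A minimal sketch of that exhaustive variant might look like the
    following. The name all_splits and the tiny sample dictionary are
    made up for illustration, and nothing here is tuned for speed:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample dictionary; in practice load a real word list.
    my %DICT = map { $_ => 1 } qw(mail mailed editorial online trading);

    # Return every way to split $phrase into dictionary words,
    # each split as an array reference of words.
    sub all_splits {
        my ($phrase) = @_;
        return ( [] ) if $phrase eq '';    # nothing left: one empty split
        my @results;
        for my $i ( 1 .. length($phrase) ) {
            my $frag = substr( $phrase, 0, $i );
            next unless exists $DICT{$frag};
            # Try every split of the remainder, not just the first match.
            for my $rest ( all_splits( substr( $phrase, $i ) ) ) {
                push @results, [ $frag, @$rest ];
            }
        }
        return @results;
    }

    print join( ' ', @$_ ), "\n" for all_splits('maileditorial');
    ```

    With this sample dictionary the "mailed" prefix leads to a dead end
    and is simply dropped, while "mail" + "editorial" survives. A phrase
    with several valid splits returns all of them, so the caller still
    has to pick the best one, and long inputs may need caching of
    already-explored remainders.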

    Good luck!
    -- TTK
    TTK Ciar Guest

  4. #3

    Re: SpellCheck in perl

    Thanks for the idea. I already tried this and, as you said, I got a
    lot of suggested words; handling all those word lists and finding the
    best suggestions is a big problem. But Google is one example of what
    we want; unfortunately, its code is out of reach for us to do this
    kind of thing.

    Srikanth Guest

  5. #4

    Re: SpellCheck in perl

    Once upon a time, "Srikanth" <srikanthkumar.lingala@gmail.com> said:
    >
    >Thanks for the idea. I already tried this and, as you said, I got a
    >lot of suggested words; handling all those word lists and finding the
    >best suggestions is a big problem.
    I'm not sure what you mean. Do you need a word list? Lists of base
    word forms are often called "lemma" lists. I have a pretty good one
    left over from an AI project which has 247266 words in it that you
    can use. It is available for download at:
    http://aux.ciar.org/ttk/lemma.ttkciar.01.txt

    This file is text, and has a number and a word on each line,
    separated by a tab. The number is the relative frequency of the
    word in the domain of the original project. If you can't use the
    frequency, then it's pretty easy for you to strip it out.
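
    As a rough sketch, reading that format into a hash could look like
    this. The function name load_lemmas, the lower-casing of words, and
    the sample numbers are my own assumptions, not part of the file:

    ```perl
    use strict;
    use warnings;

    # Parse "frequency<TAB>word" lines from a filehandle into a hash
    # that maps each word to its frequency.
    sub load_lemmas {
        my ($fh) = @_;
        my %dict;
        while ( my $line = <$fh> ) {
            $line =~ s/\r?\n\z//;                  # strip the line ending
            my ( $freq, $word ) = split /\t/, $line, 2;
            next unless defined $word && length $word;
            $dict{ lc $word } = $freq;             # assumption: fold to lower case
        }
        return %dict;
    }

    # Demo on an in-memory sample with made-up numbers. For the real
    # file, open it instead:
    #   open my $fh, '<', 'lemma.ttkciar.01.txt' or die "open: $!";
    my $sample = "120\tonline\n85\ttrading\n";
    open my $fh, '<', \$sample or die "cannot open sample: $!";
    my %DICT = load_lemmas($fh);
    close $fh;

    print "$_ => $DICT{$_}\n" for sort keys %DICT;
    ```

    Keeping the frequency as the hash value means a plain existence test
    still works for lookups, and the number is there if you later want
    to rank candidate words.
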
    >But Google is one example of what we want; unfortunately, its code
    >is out of reach for us to do this kind of thing.
    I have no idea what this means. Can you try saying it in a
    different way?

    Good luck,
    -- TTK
    TTK Ciar Guest

  6. #5

    Re: SpellCheck in perl

    Thanks.
    Let me explain my problem.
    I am working on a spell checker that takes a misspelt keyword as
    input (a single keyword, not multiple keywords or running text) and
    suggests correct words. For example, if I enter "tradeing", my spell
    checker suggests that the keyword should be "trading". But if I
    spell-check a misspelt compound word without a delimiter, like
    "onlinetradeing", it suggests "unlaundered", which is irrelevant. It
    does not recognize "onlinetradeing" as "online trading". Another
    example of this kind is "virtaulflowers", which should be "Virtual
    Flowers".
    If you have any ideas, please let me know.

    Thanks for replying...

    Regards,
    Srikanth.

    Srikanth Guest

  7. #6

    Re: SpellCheck in perl

    Srikanth wrote:
    > Thanks.
    > Let me explain my problem.
    > I am working on a spell checker that takes a misspelt keyword as
    > input (a single keyword, not multiple keywords or running text) and
    > suggests correct words. For example, if I enter "tradeing", my spell
    > checker suggests that the keyword should be "trading". But if I
    > spell-check a misspelt compound word without a delimiter, like
    > "onlinetradeing", it suggests "unlaundered", which is irrelevant. It
    > does not recognize "onlinetradeing" as "online trading". Another
    > example of this kind is "virtaulflowers", which should be "Virtual
    > Flowers".
    > If you have any ideas, please let me know.
    >
    > Thanks for replying...
    > NNTP-Posting-Host: 202.63.122.130
    http://cbl.abuseat.org/lookup.cgi?ip=202.63.122.130

    This same question was asked by you in news:comp.lang.perl.misc and has
    already grown a thread there. You were already told that you shouldn't
    multi-post. Now you do it again. Bye.

    --
    Affijn, Ruud

    "Gewoon is een tijger."

    Dr.Ruud Guest

  8. #7

    Re: SpellCheck in perl

    Thanks, Ruud. Everyone else is giving some help with this spell
    check, but you have given me far better help. This is the way to
    help people, right?

    Srikanth Guest
