[KinoSearch] Stemming and scoring

Eamon Daly edaly at nextwavemedia.com
Wed Feb 14 15:30:59 PST 2007



Quoting the perldoc:

  Stemming reduces words to a root form. For instance,
  "horse", "horses", and "horsing" all become "hors" -- so
  that a search for 'horse' will also match documents
  containing 'horses' and 'horsing'.

Our search is having a lot of trouble with words such as
"intern" and "internal". Am I correct in assuming that only
the stem is stored at index time, so that searches for
"intern" and "internal" return the same documents with
equal scores? If not, is there a way to bump up the score of
exact matches? If so, does anyone know of alternative
stemmers we could try, for instance one that only reduces
plurals to singular? CPAN is failing me for once.
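
To make the two ideas above concrete (a plural-only stemmer and a
boost for exact matches), here is a minimal sketch in Python. It is
not KinoSearch code and does not use its API. The names singularize,
score, and exact_boost, the suffix rules, and the boost value are all
hypothetical, chosen only for illustration.

```python
def singularize(word):
    """Hypothetical plural-to-singular stemmer built from three suffix rules.

    Only the first matching rule is applied; anything else is returned
    lowercased but otherwise unchanged.
    """
    w = word.lower()
    # "ponies" -> "pony" (skip "...eies" / "...aies")
    if w.endswith("ies") and len(w) > 4 and not w.endswith(("eies", "aies")):
        return w[:-3] + "y"
    # "horses" -> "horse" (skip "...aes" / "...ees" / "...oes")
    if w.endswith("es") and len(w) > 3 and not w.endswith(("aes", "ees", "oes")):
        return w[:-1]
    # "interns" -> "intern" (skip "...us" / "...ss")
    if w.endswith("s") and len(w) > 2 and not w.endswith(("us", "ss")):
        return w[:-1]
    return w


def score(query, doc_tokens, exact_boost=2.0):
    """Toy scorer: stem matches count 1.0, exact token matches count exact_boost.

    This is an assumption about how an exact-match boost could work,
    not a description of how any real search library scores documents.
    """
    q = query.lower()
    q_stem = singularize(q)
    total = 0.0
    for tok in doc_tokens:
        t = tok.lower()
        if t == q:
            total += exact_boost      # identical surface form
        elif singularize(t) == q_stem:
            total += 1.0              # same stem, different surface form
    return total


if __name__ == "__main__":
    for w in ("horse", "horses", "horsing", "intern", "interns", "internal"):
        print(w, "->", singularize(w))
    print(score("intern", ["interns", "intern", "internal"]))
```

With rules this conservative, "intern" and "internal" stay distinct,
at the cost that "horsing" no longer matches "horse". The scorer
assumes the unstemmed tokens are still available at search time, for
example in a second, unstemmed field.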

____________________________________________________________
Eamon Daly
NextWave Media Group
Tel: 773 975-1115
Fax: 773 913-0970

_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch



