[KinoSearch] Stemming and scoring

Eamon Daly edaly at nextwavemedia.com
Thu Feb 15 08:35:38 PST 2007



: If you can spare the resources, you can try indexing the same data  
: twice -- once with stemming and once without.  That will provide the  
: bump for exact matches.

Thanks for the quick reply! I see that spec_field lets me
specify an analyzer per field, but I don't see an equivalent
for the Searcher. On the index side, I think you mean I'd do:

  my $analyzer_with_stemming = KinoSearch::Analysis::PolyAnalyzer->new
    (
     language => 'en'
    );

  my $analyzer_without_stemming = KinoSearch::Analysis::PolyAnalyzer->new
    (
     analyzers =>
     [
      KinoSearch::Analysis::LCNormalizer->new,
      KinoSearch::Analysis::Tokenizer->new,
     ]
    );

  # ...

  $invindexer->spec_field
    (
     name     => 'title',
     analyzer => $analyzer_without_stemming,
     boost    => 3,
    );

  $invindexer->spec_field
    (
     name     => 'title_stemmed',
     analyzer => $analyzer_with_stemming
    );

But I don't see an equivalent on the search side, where the
QueryParser takes a single analyzer for all of its fields:

  my $query_parser = KinoSearch::QueryParser::QueryParser->new
    (
     analyzer       => $analyzer,
     fields         => [ 'title', 'title_stemmed' ],
     default_boolop => 'AND',
    );

I suspect I have to go the long way 'round and build a
QueryParser of my own. Correct?
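
One possible shortcut, sketched here under assumptions rather than
confirmed in this thread: build one QueryParser per field, give each
the analyzer that field was indexed with, parse the same string with
both, and combine the results as SHOULD clauses in a BooleanQuery.
The BooleanQuery add_clause( query => ..., occur => ... ) call is
assumed from the KinoSearch 0.15-era API, and $query_string is a
hypothetical input; both should be checked against the installed
version's documentation.

```perl
use strict;
use warnings;
use KinoSearch::Analysis::PolyAnalyzer;
use KinoSearch::Analysis::LCNormalizer;
use KinoSearch::Analysis::Tokenizer;
use KinoSearch::QueryParser::QueryParser;
use KinoSearch::Search::BooleanQuery;

# The same two analyzers that were used at index time.
my $analyzer_with_stemming
    = KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
my $analyzer_without_stemming = KinoSearch::Analysis::PolyAnalyzer->new(
    analyzers => [
        KinoSearch::Analysis::LCNormalizer->new,
        KinoSearch::Analysis::Tokenizer->new,
    ],
);

# Pair each field with the analyzer it was indexed with.
my %analyzer_for = (
    title         => $analyzer_without_stemming,
    title_stemmed => $analyzer_with_stemming,
);

# Hypothetical user input, for illustration only.
my $query_string = 'running shoes';

# Parse the same string once per field and OR the results together
# (assumed API: add_clause with 'SHOULD').
my $query = KinoSearch::Search::BooleanQuery->new;
for my $field ( sort keys %analyzer_for ) {
    my $parser = KinoSearch::QueryParser::QueryParser->new(
        analyzer       => $analyzer_for{$field},
        fields         => [$field],
        default_boolop => 'AND',
    );
    $query->add_clause(
        query => $parser->parse($query_string),
        occur => 'SHOULD',
    );
}
```

Passing $query to $searcher->search( query => $query ) should then
match on either field. A document that matches the unstemmed title
field satisfies both clauses and also picks up that field's boost of
3, which is where the exact-match bump would come from.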

____________________________________________________________
Eamon Daly
NextWave Media Group
Tel: 773 975-1115
Fax: 773 913-0970

