[KinoSearch] Stemming and scoring
Eamon Daly
edaly at nextwavemedia.com
Thu Feb 15 08:35:38 PST 2007
: If you can spare the resources, you can try indexing the same data
: twice -- once with stemming and once without. That will provide the
: bump for exact matches.
Thanks for the quick reply! I see that spec_field lets me
specify an analyzer per field, but I don't see an equivalent for
the Searcher. On the index side, I think you mean I'd do:
    my $analyzer_with_stemming = KinoSearch::Analysis::PolyAnalyzer->new
    (
        language => 'en',
    );
    my $analyzer_without_stemming = KinoSearch::Analysis::PolyAnalyzer->new
    (
        analyzers =>
        [
            KinoSearch::Analysis::LCNormalizer->new,
            KinoSearch::Analysis::Tokenizer->new,
        ],
    );
    # ...
    $invindexer->spec_field
    (
        name     => 'title',
        analyzer => $analyzer_without_stemming,
        boost    => 3,
    );
    $invindexer->spec_field
    (
        name     => 'title_stemmed',
        analyzer => $analyzer_with_stemming,
    );
But I don't see an equivalent on the search side, where the
QueryParser appears to take a single analyzer for all fields:
    my $query_parser = KinoSearch::QueryParser::QueryParser->new
    (
        analyzer       => $analyzer,
        fields         => [ 'title', 'title_stemmed' ],
        default_boolop => 'AND',
    );
I suspect I have to go the long way 'round and build a
QueryParser of my own. Correct?
____________________________________________________________
Eamon Daly
NextWave Media Group
Tel: 773 975-1115
Fax: 773 913-0970