[KinoSearch] Boosting doc scores
Dermot
paikkos at googlemail.com
Mon Apr 12 10:34:44 PDT 2010
On 12 April 2010 14:45, Peter Karman <peter at peknet.com> wrote:
> Dermot wrote on 04/12/2010 05:20 AM:
>
> In my reading of the code, the doc_boost is a float and can be as big as
> a float can be. It gets passed through to the underlying Posting::*
> class, which uses it like this:
>
> float field_boost = doc_boost * FType_Get_Boost(type) * length_norm;
>
> so it ends up being applied to all fields in the doc. Will it over-ride
> the relevance? Depends on how big it is, I guess. The whole point is to
> skew the raw IDF/TF score in one direction or another. How much it is
> skewed will depend on a host of factors. If it were me, I would start
> small (e.g. 2.0 or twice the normal) and see how it affects the
> rankings. You're looking for a sweet spot where it affects them just
> enough to privilege what you're after and not so much that it drowns out
> reasonable rankings. Like a salad dressing.
A beautiful analogy and you've confirmed what I thought. Boosting at
index time would be a rather blunt tool but worth an experiment.
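
To put rough numbers on the line Peter quoted, here is a minimal sketch
in C. The function name field_boost and the sample values are made up
for illustration; only the multiplication itself comes from the quoted
line, and the real Posting code may well do more around it.

```c
#include <assert.h>

/* Hypothetical stand-in for the quoted line:
 *   float field_boost = doc_boost * FType_Get_Boost(type) * length_norm;
 * type_boost plays the role of FType_Get_Boost(type). */
float field_boost(float doc_boost, float type_boost, float length_norm) {
    /* Assumption for this sketch: length_norm is a positive factor. */
    assert(length_norm > 0.0f);
    return doc_boost * type_boost * length_norm;
}
```

With type_boost 1.0 and length_norm 0.5, a doc_boost of 2.0 gives 1.0
where the default of 1.0 gives 0.5. Since the same factor is applied to
every field in the doc, it scales the document as a whole rather than
favouring one field over another.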
>> If I can't get satisfactory results by boosting at index time, I'll
>> have to attempt the far trickier business of boosting at search time.
>> Option one would be preferable :)
>
> Search-time would give you much more control since you could alter
> rankings based on the actual query and/or resultset, rather than a
> one-size-fits-all approach at indexing time. But like you point out,
> it's more work.
It is indeed, and I suspect there will be a lot of questions on how to
work the PrefixScorer and Matcher. From a quick look at
~/KinoSearch-0.30_10/lib/KinoSearch/Docs/Cookbook/CustomQuery.pod#PrefixScorer
I'd say by overriding the score() method, but that seems too easy to be true.
Dp.