[KinoSearch] KinoSearch Death
marvin at rectangular.com
Fri Oct 20 14:52:10 PDT 2006
On Oct 20, 2006, at 1:14 PM, Chris Nandor wrote:
> Another would be to copy all the index files to each httpd box, instead of
> using NFS. Pain.
Well, most of the time the index doesn't change very much, so you
wouldn't have to copy the whole thing every 5 minutes if you went
that route. Segments stick around as long as they can. The
Fibonacci-based merge trigger is designed to minimize churn.
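As a rough illustration of why stable segments make copying cheap, here is a minimal Python sketch of an incremental sync that copies only files whose size or modification time differs, and removes files that have disappeared from the source. This is not part of KinoSearch; the function name sync_index, the flat directory layout, and the size/mtime comparison are all assumptions made for the example. It also ignores the question of searchers reading the destination while it is being updated.

```python
import os
import shutil

def sync_index(src_dir, dst_dir):
    """Hypothetical helper: mirror src_dir into dst_dir, copying only changed files.

    Returns (copied, removed) as sorted lists of file names.
    """
    os.makedirs(dst_dir, exist_ok=True)
    src_names = set(os.listdir(src_dir))
    copied = []
    for name in sorted(src_names):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if not os.path.isfile(src):
            continue  # assumption: the index is a flat directory of files
        s = os.stat(src)
        if os.path.isfile(dst):
            d = os.stat(dst)
            # Treat a file as unchanged when size and (whole-second) mtime match.
            if d.st_size == s.st_size and int(d.st_mtime) == int(s.st_mtime):
                continue
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(name)
    removed = []
    for name in sorted(os.listdir(dst_dir)):
        dst = os.path.join(dst_dir, name)
        if name not in src_names and os.path.isfile(dst):
            os.remove(dst)  # file was merged away on the source side
            removed.append(name)
    return copied, removed
```

If segments really do stick around between merges, most runs of a loop like this would copy only the handful of files that changed since the last pass.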
Check out how Doug set up Lucene for Technorati.
I'm also kind of curious about how many servers you can point at the
same NFS volume before you end up I/O bound.
> Also, I wonder ... could search() fail and *not die* like that? Maybe only
> part of the file is gone? Or is this all-or-nothing? I *think* from what
> I understand of the problem, it will fail entirely or work entirely, but I
> lack full confidence in that assessment.
The InStream class throws that error when it tries to read something
that ought to be there and fails. KS checks the return value for
every read call. There are very, very few opportunities for a read
failure to produce incorrect data.
An InStream that gets out of sync has the potential to produce
invalid output for a little bit. But it usually dies almost
immediately -- typically when it tries to read a string header vint
and the decoded vint tells it that the string is waaaaaayyyy longer
than it actually is. The InStream tries to read that many bytes, slams
into an EOF, and throws an error.