[KinoSearch] vectors + large indices
Marvin Humphrey
marvin at rectangular.com
Fri May 1 09:23:20 PDT 2009
On Fri, May 01, 2009 at 08:23:08AM -0700, webmasters at ctosonline.org wrote:
> Well, I’m back again after about a year.
Nice to hear from you again. :)
> I’m having a problem with a KinoSearch index that seems to have its
> vector info corrupted—either that or the code retrieving the vectors
> is buggy. I’m still using revision 3122, but I’m going to upgrade and
> try reproducing the problem with the latest revision.
FYI, there have been many major changes since then. 3122 is not compatible
with current svn trunk in terms of either API or file format. With regard to
highlighting:
* The "vectorized" flag has been renamed to "highlightable" and is now off
  by default.
* Highlighter, like everything else, has been ported to C.
* Highlighter's internal workings have changed. This process is incomplete
  and the present implementation is buggy.
I plan to adapt Highlighter to approximate the algorithm discussed at
<https://issues.apache.org/jira/browse/LUCENE-1522> and cure its bugginess
prior to the next release. That work is on my to-do list, after I finish
segment-centric sorting and real-time indexing.
> I just wanted to bring it up first in case you are already aware of any
> such problem.
IIRC, there may have been something wrt the addition of segments with no
highlight data resulting in bogus empty files. But that applied to the stable
branch, and I don't recall whether it also afflicted svn trunk.
Marvin Humphrey