[KinoSearch] Dynamic schemas - How?
Marvin Humphrey
marvin at rectangular.com
Tue Feb 27 01:03:31 PST 2007
On Feb 26, 2007, at 10:30 PM, Marc Elser wrote:
> I just took a look at KinoSearch 0.20_01 because I've been longing
> for this new release.
>
> To my great surprise, I saw that the index structure now is based on
> subclassing KinoSearch::Schema::FieldSpec. Well that is a big
> problem for users like me who dynamically create indexes based on
> columns in our sql-tables which can be flagged to be indexed. Of
> course statically defined subclasses of KinoSearch::Schema are not
> possible with this setup.
Maybe not, but you can simulate them, because Perl is dynamic.
> how can I define dynamic Schemas in KS 0.20???
At index time, it's possible, though kludgy.
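    # Generate one field class per field name at runtime.
    for my $field_name (@field_names) {
        eval qq|
            package MySchema::$field_name;
            use base qw( KinoSearch::Schema::Field );
        |;
        die $@ if $@;
    }
    # Register the generated fields with the Schema subclass.
    MySchema->init_fields(@field_names);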
That's essentially what I'm doing in my provisional implementation of
KinoSearch::Simple.
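Since the field names end up interpolated into a string eval, names that come from SQL column metadata are worth validating first. Here is a minimal sketch of the same technique that runs without KinoSearch installed. My::FieldBase is a hypothetical stand-in for the real field base class, and define_field_classes() is an illustrative helper, not part of KS. It sets @ISA directly rather than calling "use base", which would try to load the stand-in class from disk in some setups.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the KinoSearch field base class, so that this
# sketch runs without KinoSearch installed.
package My::FieldBase;
sub new { my $class = shift; return bless {}, $class }

package main;

# Illustrative helper (not a KinoSearch API): create one subclass of
# $base per field name under the $schema namespace.
sub define_field_classes {
    my ( $schema, $base, @field_names ) = @_;
    for my $field_name (@field_names) {
        # Refuse anything that is not a plain identifier, since the name
        # is about to be interpolated into code.
        die "Invalid field name: '$field_name'"
            unless $field_name =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/;
        eval qq|
            package ${schema}::${field_name};
            our \@ISA = ('${base}');
            1;
        | or die $@;
    }
    return;
}

define_field_classes( 'MySchema', 'My::FieldBase', qw( title body ) );
print 'MySchema::title'->isa('My::FieldBase') ? "ok\n" : "not ok\n";
```

With the real library, the base class would be whatever the installed KS version expects, and the call to init_fields() would follow as in the loop above.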
The bigger problem in your case is what to do at search time. KS no
longer stores information about what fields are indexed, analyzed,
stored, anything -- all that information is communicated via the
Schema. All that gets stored as far as field defs go is a
per-segment field-name-to-field-num mapping.
To kludge up a search-time Schema, you could maybe write a file with
the field names in it to the index directory, then read that file and
generate your Schema subclass on the fly at search-time, too. Not
the most elegant solution, but should be usable, no?
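The file round trip described above could look roughly like this. It is only a sketch: the file name field_names.txt and both helper subs are made up for illustration, and it assumes an extra file in the index directory is left untouched, which is not verified here. A temporary directory stands in for the index directory.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw( tempdir );

# Hypothetical file name; KinoSearch does not define or read this file.
my $FIELDS_FILE = 'field_names.txt';

# Index time: record the field names, one per line, next to the index.
sub save_field_names {
    my ( $index_dir, @field_names ) = @_;
    my $path = File::Spec->catfile( $index_dir, $FIELDS_FILE );
    open my $fh, '>', $path or die "Can't write '$path': $!";
    print {$fh} "$_\n" for @field_names;
    close $fh or die "Can't close '$path': $!";
    return;
}

# Search time: read the names back in the order they were written.
sub load_field_names {
    my ($index_dir) = @_;
    my $path = File::Spec->catfile( $index_dir, $FIELDS_FILE );
    open my $fh, '<', $path or die "Can't read '$path': $!";
    chomp( my @field_names = <$fh> );
    close $fh;
    return @field_names;
}

# Demo against a temporary directory standing in for the index directory.
my $index_dir = tempdir( CLEANUP => 1 );
save_field_names( $index_dir, qw( title body author ) );
my @field_names = load_field_names($index_dir);
print join( ',', @field_names ), "\n";
```

At search time the loaded names would feed the same eval loop and init_fields() call used at index time.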
The eventual plan is to improve the situation over what exists in KS
0.15. Right now I have to dedicate most of my devel time to certain
large-scale performance optimizations, but here's some of what I have
in mind...
[ ... ]
OK, the rationale behind Schema got too long so I offloaded it to a
separate email.
[ ... ]
The next feature I'd planned to add to KinoSearch's Schema API is
something called DeepFieldSpec. It would allow KS to fake
one-to-many relationships by applying a common FieldSpec to class names
which share a common prefix.
Maybe we can bend that concept into something that fits your needs.
You don't know the field names in advance at index-time, but you must
know exactly how you're going to define the fields -- otherwise, you
couldn't make this work with KS 0.1x. So we have a field spec. We
just need to associate it with field names.
Are there multiple specs?
Do they ever change?
Do you ever need to add fields in the middle of an indexing session
or do you know them all up front?
What we probably need is a new KinoSearch::Schema class method, akin
to init_fields() but with one more layer of indirection. Instead of
telling your Schema about a field, you tell it about a FieldSpec
subclass and one or more field names. Are you with me? Could that
work for you?
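To make the idea concrete, here is one possible shape for such a method, as a pure-Perl mock-up. Nothing in it exists in KinoSearch: the Sketch:: classes, init_fields_with_spec() and spec_class_for() are all invented names, and the real design may end up looking quite different.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: no class or method below is KinoSearch API.
package Sketch::FieldSpec;
sub new { my $class = shift; return bless {}, $class }

package Sketch::TextSpec;
our @ISA = ('Sketch::FieldSpec');

package Sketch::Schema;
my %spec_for;    # schema class => { field name => FieldSpec subclass }

# One more layer of indirection than init_fields(): the caller passes a
# FieldSpec subclass plus the field names that should share it.
sub init_fields_with_spec {
    my ( $class, $spec_class, @field_names ) = @_;
    die "'$spec_class' is not a FieldSpec subclass"
        unless $spec_class->isa('Sketch::FieldSpec');
    $spec_for{$class}{$_} = $spec_class for @field_names;
    return;
}

# Look up which spec class a given field name was registered with.
sub spec_class_for {
    my ( $class, $field_name ) = @_;
    return $spec_for{$class}{$field_name};
}

package MySchema;
our @ISA = ('Sketch::Schema');

package main;

MySchema->init_fields_with_spec( 'Sketch::TextSpec', qw( title body ) );
print MySchema->spec_class_for('title'), "\n";
```

Calling the method once per spec class would cover the multiple-specs case, and calling it again later would cover fields that show up in the middle of an indexing session.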
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/