[KinoSearch] Serialized Schema
marvin at rectangular.com
Sat Sep 29 15:32:43 PDT 2007
On Sep 6, 2007, at 5:40 PM, Peter Karman wrote:
> (And re: the url thread above: for the record, I like the .yml
> format better than .xml; if libswish3 weren't already possessed of
> a full XML parser, I would probably use .yml in Swish3 too.)
Have you considered JSON? :)
I'm annoyed by the fact that there isn't a minimal "YAML level 1"
spec. The complete YAML spec is grievously afflicted by featuritis.
Here's the problem: right now, KS uses custom routines to read/write
a small subset of YAML. But if other implementations start using the
file format, it will be easy for them to produce something that's
valid YAML but that KS isn't prepared to handle.
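To make the mismatch concrete, here is a rough sketch in Python of a checker that flags YAML constructs outside a block-style hash/array subset. Everything in it is an assumption for illustration: the message doesn't say which subset KS actually reads, the name `features_outside_subset` is made up, and the regexes are line-based heuristics, not a YAML parser.

```python
import re

# Hypothetical markers of YAML features that a reader limited to
# block-style hashes, arrays, and one-line scalars would not expect.
# These are heuristics for illustration only.
UNSUPPORTED = {
    "flow collection": re.compile(r":\s*[\[{]|^\s*-\s*[\[{]"),
    "anchor or alias": re.compile(r"(:|-)\s+[&*]\w+"),
    "tag": re.compile(r"(:|-)\s+!"),
    "block scalar": re.compile(r":\s*[|>][+-]?\s*$"),
}


def features_outside_subset(text):
    """Return a sorted list of the feature names detected in `text`."""
    found = set()
    for line in text.splitlines():
        for name, pattern in UNSUPPORTED.items():
            if pattern.search(line):
                found.add(name)
    return sorted(found)


# Plain block style: nothing is flagged.
plain = "fields:\n  - title\n  - body\n"

# Still valid YAML, but it uses an anchor, a flow mapping, an alias,
# and a block scalar.
fancy = (
    "defaults: &defaults\n"
    "  analyzer: {language: en}\n"
    "title: *defaults\n"
    "body: |\n"
    "  free text\n"
)

print(features_outside_subset(plain))
print(features_outside_subset(fancy))
```

Both documents are legal YAML, but only the first stays inside the kind of subset a small hand-written reader could be expected to cover.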
This is sort-of solvable by adding a fully compliant YAML parser to
the KS dependency chain -- which naturally I intend to avoid. But
the general problem would still exist: so long as the invindex file
format specifies "YAML", any implementation would be required to have
a complete -- and thus monstrous -- YAML parser to read externally
generated invindexes reliably.
CPAN has YAML::Tiny, which was inspired by the same sense of
revulsion I feel when perusing the YAML spec. Unfortunately, it's a
non-specific subset implementation, not a strictly defined spec.
I'm tempted to write a formal spec called "ASHL" -- Array Scalar Hash
Language. The target for ASHL level 1 would be non-trivial config
files -- basically, stuff that's too complex for .ini-style pairs.
It would use YAML's indentation and its notation for hashes and
arrays, but scalars would be single-line only and the character set
would be limited to ASCII.
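As a rough illustration of how small a level 1 reader could be, here is a sketch in Python. ASHL has no spec, so every rule below is an assumption taken from the description above: indentation with spaces, `key: value` for hashes, `- item` for arrays, single-line scalars returned as strings, ASCII only. The names `parse_ashl`, `_parse_block`, and `_parse_child` are hypothetical.

```python
def parse_ashl(text):
    """Parse a hypothetical ASHL level 1 document into dicts, lists, and strings."""
    text.encode("ascii")  # raises UnicodeEncodeError (a ValueError) on non-ASCII
    lines = []
    for raw in text.splitlines():
        if raw.strip():  # skip blank lines
            indent = len(raw) - len(raw.lstrip(" "))
            lines.append((indent, raw.strip()))
    if not lines:
        return {}
    value, pos = _parse_block(lines, 0, lines[0][0])
    if pos != len(lines):
        raise ValueError("unexpected indentation near: %r" % lines[pos][1])
    return value


def _parse_block(lines, pos, indent):
    """Parse consecutive lines at exactly `indent`; return (value, next_pos)."""
    is_array = lines[pos][1].startswith("-")
    result = [] if is_array else {}
    while pos < len(lines) and lines[pos][0] == indent:
        content = lines[pos][1]
        pos += 1
        if is_array:
            if not content.startswith("-"):
                raise ValueError("hash entry inside array: %r" % content)
            # Assumption: text after "- " is always a scalar, never an
            # inline hash.
            scalar = content[1:].strip()
            if not scalar:
                child, pos = _parse_child(lines, pos, indent)
            result.append(scalar or child)
        else:
            key, sep, rest = content.partition(":")
            if not sep:
                raise ValueError("expected 'key: value': %r" % content)
            scalar = rest.strip()
            if not scalar:
                child, pos = _parse_child(lines, pos, indent)
            result[key.strip()] = scalar or child
    return result, pos


def _parse_child(lines, pos, parent_indent):
    """Parse the nested block that must follow a bare 'key:' or '-' line."""
    if pos >= len(lines) or lines[pos][0] <= parent_indent:
        raise ValueError("expected a nested block")
    return _parse_block(lines, pos, lines[pos][0])


doc = "name: test\nfields:\n  - title\n  - body\nopts:\n  stored: 1\n"
print(parse_ashl(doc))
```

The sketch makes no attempt at quoting, comments, or type detection, so `stored: 1` comes back as the string `"1"`.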
The problem with that idea, though, is that when you expand it
outwards to ASHL level 2, you need to add Unicode escapes and
multi-line scalars. At that point, it starts to look an awful lot like
JSON, and it's hard to justify as an independent format.