[KinoSearch] question on querying by url
Filippo A. Salustri
salustri at ryerson.ca
Mon Apr 17 10:31:49 PDT 2006
No luck yet.
I rebuilt the database to make sure that the 'url' field had
analyzed => 0 for all records.
But when I do
$ks->search("url:http://blah.....")
I still get no hits.
Will keep futzing around with it. But further ideas would be welcome.
Cheers.
Fil
Marvin Humphrey wrote:
>
> On Apr 17, 2006, at 5:40 AM, Filippo A. Salustri wrote:
>
>>> $ki->spec_field ( name => 'url', boost => 1, indexed => 1,
>>> analyzed => 1, stored => 1, compressed => 0 );
>
> Unless you want to search for individual chunks within a URL (which
> would be a pretty unusual technique), the field should not be analyzed.
>
> $ki->spec_field (
> name => 'url',
> boost => 1,
> indexed => 1,
> analyzed => 0, # !!
> stored => 1,
> compressed => 0,
> );
>
>> Say the URL I want to search for is given by
>> $q = "http://deseng.ryerson.ca/~fil";
>> I then do:
>>> my $ks = KinoSearch::Searcher->new(
>>>     invindex => "$serfcgi/db",
>>>     analyzer => KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' ),
>>> );
>>> return $ks->search($q);
>>
>> I get an error saying that "http" is not a valid field name. That's
>> cool - I understand why it would do that.
>
> Thanks for illustrating exactly why this parser behavior must be
> documented. :\
>
> I think you've illustrated a second problem as well: KinoSearch is dying
> when presented with an invalid field name, but it should just return an
> empty result set instead.
>
>> So I do
>> $q = "url:http://deseng.ryerson.ca/~fil";
>>
>> Now the search returns 0 hits.
>>
>> Any ideas on what I'm doing wrong?
>
> The Term that the QueryParser is creating looks like this:
>
> KinoSearch::Index::Term->new( 'url', 'http://deseng.ryerson.ca/~fil' );
>
> ... but because the url field was analyzed using the English
> PolyAnalyzer at index-time, you're only going to get results if you
> search for "http", "deseng", "ryerson", "ca", or "fil". Try this as an
> experiment:
>
> $q = "url:fil";
>
> I bet you will get some hits.
>
> Marvin Humphrey
> Rectangular Research
> http://www.rectangular.com/
>
>
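The mismatch Marvin describes can be sketched in plain Perl, without
KinoSearch at all. This is only an illustration under stated assumptions:
toy_analyze, build_index and lookup are hypothetical helpers, and the
tokenizer is a crude stand-in (lowercase, split on non-word characters),
not what PolyAnalyzer actually does internally. It shows why an exact-term
lookup for the whole URL misses when the field was broken into tokens at
index time, and hits when the field was stored as a single term.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an analyzer: lowercase the text and split it
# on non-word characters. A real analyzer may also stem, drop stopwords, etc.
sub toy_analyze {
    my ($text) = @_;
    return map { lc($_) } ( $text =~ /(\w+)/g );
}

# Build a toy inverted index: term => { doc_id => 1 }.
# If $analyzed is true, the field value is tokenized first;
# otherwise the whole value is indexed as one term.
sub build_index {
    my ( $docs, $analyzed ) = @_;
    my %index;
    for my $doc_id ( sort keys %$docs ) {
        my $value = $docs->{$doc_id};
        my @terms = $analyzed ? toy_analyze($value) : ($value);
        $index{$_}{$doc_id} = 1 for @terms;
    }
    return \%index;
}

# Exact term lookup: the query term is NOT analyzed.
sub lookup {
    my ( $index, $term ) = @_;
    return sort keys %{ $index->{$term} || {} };
}

my $url  = 'http://deseng.ryerson.ca/~fil';
my %docs = ( 1 => $url );

my $analyzed_index   = build_index( \%docs, 1 );
my $unanalyzed_index = build_index( \%docs, 0 );

my @analyzed_exact   = lookup( $analyzed_index,   $url );
my @analyzed_token   = lookup( $analyzed_index,   'fil' );
my @unanalyzed_exact = lookup( $unanalyzed_index, $url );

printf "analyzed field, full URL:   %d hit(s)\n", scalar @analyzed_exact;
printf "analyzed field, 'fil':      %d hit(s)\n", scalar @analyzed_token;
printf "unanalyzed field, full URL: %d hit(s)\n", scalar @unanalyzed_exact;
```

In this sketch the analyzed index contains only the terms "http",
"deseng", "ryerson", "ca" and "fil", so the full URL finds nothing there,
while the unanalyzed index holds the URL as one term and matches it
exactly.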
--
Prof. Filippo A. Salustri, Ph.D., P.Eng.
Department of Mechanical and Industrial Engineering
Ryerson University Tel: 416/979-5000 x7749
350 Victoria St. Fax: 416/979-5265
Toronto, ON email: salustri at ryerson.ca
M5B 2K3 Canada http://deseng.ryerson.ca/~fil/
_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch