[KinoSearch] PolyFilter and Plans

Marvin Humphrey marvin at rectangular.com
Tue Apr 3 14:25:51 PDT 2007




On Apr 3, 2007, at 9:10 AM, Chris Nandor wrote:

> We need to cache the BitVector of a Filter on a per-reader basis.  That
> BitVector should be destroyed when the Reader is.
>
> A Filter generally does not change, but it can, particularly in the case of
> a PolyFilter.  When the Filter changes, the BitVector should no longer be
> used, and, ideally, it should be cleaned up, as it is not likely to be used
> again.
>
>
> That's basically it.  Then there's implementation.

Well stated.

> [NB: we still need a way to generate hash_code for PolyFilter.]

That could be as simple as adding the hashcodes of all component  
filters, plus some extra factor for each logic.
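
A rough sketch of that summing approach, for illustration only.  The
poly_hash_code name, the { logic, hash_code } component shape, and the
factor values are all made up here; none of them are existing KinoSearch
API.

```perl
use strict;
use warnings;

# Hypothetical per-logic factors; the values themselves are arbitrary.
my %LOGIC_FACTOR = ( AND => 17, OR => 31, NOT => 43 );

# Sum the hash codes of all components, adding an extra factor for each
# component's logic.  Each component is assumed to be a hashref such as
# { logic => 'AND', hash_code => 12345 }.
sub poly_hash_code {
    my @components = @_;
    my $sum = 0;
    for my $component (@components) {
        my $factor = $LOGIC_FACTOR{ $component->{logic} };
        # Mask to 31 bits so the running total stays a small integer.
        $sum = ( $sum + $component->{hash_code} + $factor ) & 0x7FFFFFFF;
    }
    return $sum;
}

my $code = poly_hash_code(
    { logic => 'AND', hash_code => 100 },
    { logic => 'OR',  hash_code => 200 },
);
```

Because addition is commutative, the same components in a different order
produce the same code.  Distinct combinations can still collide, as with
any hash code.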

> We could also use weak references.  I generally think weak refs are a hack,
> but then again, the above is also turning into a hack.  :)

Weak refs are a hack, but sometimes they enable elegant solutions.   
Tell me what you think of the following, which doesn't even involve  
IndexReader:

# Filter.pm

use Scalar::Util qw( weaken );

# Store a cached BitVector associated with a particular reader.  Store a weak
# reference to the Reader as an indicator of cache validity.
sub store_cached_bits {
     my ( $self, $reader, $bits ) = @_;
     my $pair = { reader => $reader, bits => $bits };
     weaken( $pair->{reader} );
     $self->{cached_bits}{$reader} = $pair;
}

# Retrieve a cached BitVector associated with a particular reader.  As a side
# effect, clear away any BitVectors which are no longer valid because their
# readers have gone away.
sub fetch_cached_bits {
     my ( $self, $reader ) = @_;
     my $cached_bits = $self->{cached_bits};

     # sweep
     while ( my ( $stringified, $pair ) = each %$cached_bits ) {
         # if weak ref has decomposed into undef, reader is gone... so delete
         next if defined $pair->{reader};
         delete $cached_bits->{$stringified};
     }

     # fetch
     my $pair = $cached_bits->{$reader};
     return $pair->{bits} if defined $pair;
     return;
}
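
Here is a self-contained sketch of how that cache would behave once a
reader goes away.  DemoFilter and DemoReader are stand-in packages
invented for the demo, not real KinoSearch classes, and the sweep is
written with keys() rather than each(), purely as a stylistic variation.

```perl
use strict;
use warnings;

package DemoFilter;    # hypothetical stand-in for a Filter subclass
use Scalar::Util qw( weaken );

sub new { return bless { cached_bits => {} }, shift }

sub store_cached_bits {
    my ( $self, $reader, $bits ) = @_;
    my $pair = { reader => $reader, bits => $bits };
    weaken( $pair->{reader} );    # the cache must not keep the reader alive
    $self->{cached_bits}{$reader} = $pair;
}

sub fetch_cached_bits {
    my ( $self, $reader ) = @_;
    my $cached_bits = $self->{cached_bits};

    # sweep: drop entries whose weak reader ref has become undef
    for my $key ( keys %$cached_bits ) {
        delete $cached_bits->{$key}
            unless defined $cached_bits->{$key}{reader};
    }

    # fetch
    my $pair = $cached_bits->{$reader};
    return defined $pair ? $pair->{bits} : undef;
}

package main;

my $filter = DemoFilter->new;
my $reader = bless {}, 'DemoReader';
$filter->store_cached_bits( $reader, 'bits-for-reader' );
my $hit = $filter->fetch_cached_bits($reader);    # cached value comes back

undef $reader;    # last strong ref is gone, so the weak ref turns into undef

# A new reader may or may not reuse the old address; either way the sweep
# removes the stale entry before the lookup happens.
my $other     = bless {}, 'DemoReader';
my $miss      = $filter->fetch_cached_bits($other);
my $remaining = scalar keys %{ $filter->{cached_bits} };
```

After the second fetch the cache is empty: the stale entry was swept, and
nothing was ever stored for the new reader.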

> However, apart from creating a dispose() method, we would also need to
> uniquely identify the *Reader*, instead of the Filter, unless we think the
> current method of using a stringified reference is sufficient.

Stringification is sufficient for the above, because the undef-ness  
of the decomposed weak reference protects us against a second reader  
impersonating one that's dead and gone.

FWIW, we might consider adding this to KinoSearch::Util::Class:

   use Scalar::Util qw( refaddr );

   sub hash_code { refaddr(shift) }

Then we could use $reader->hash_code in place of "$reader", above.   
But it doesn't really matter for this particular application.
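
One case where the two would differ, sketched with made-up classes
(DemoBase and DemoReader are not KinoSearch classes): if a class ever
overloaded stringification, "$reader" would stop being unique per
object, while refaddr would not be affected.

```perl
use strict;
use warnings;

package DemoBase;    # hypothetical stand-in for KinoSearch::Util::Class
use Scalar::Util qw( refaddr );

sub new       { return bless {}, shift }
sub hash_code { return refaddr(shift) }

package DemoReader;    # hypothetical subclass with overloaded stringification
our @ISA = ('DemoBase');
use overload '""' => sub { 'a reader' };

package main;

my $first  = DemoReader->new;
my $second = DemoReader->new;

# Both objects stringify to the same text, so "$reader" keys would collide.
my $same_string = "$first" eq "$second";

# refaddr ignores overloading, so two live objects get distinct codes.
my $same_code = $first->hash_code == $second->hash_code;
```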

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
