[KinoSearch] utf8 (unicode) any progress on TokenBatch?

Tatsuhiko Miyagawa miyagawa at gmail.com
Tue Aug 15 23:32:41 PDT 2006



On 8/15/06, Ryan Tate <lists at ryantate.com> wrote:
> I recently put together a Web aggregator scraping and parsing various
> documents of various encodings into a single summary page. I basically
> decided everything would be converted into utf8 and output as utf8.
> Along the way I discovered a number of utf8 issues in modules ranging
> from LWP::Simple to XML::Atom.

As of XML::Atom 0.20, there is a $ForceUnicode global flag (which
defaults to 0 for backward compatibility) that makes it work explicitly
in Unicode mode, i.e. with Perl character strings rather than UTF-8 bytes.
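
To illustrate the distinction between "Unicode mode" and "UTF-8 bytes",
here is a minimal sketch using only the core Encode module. It does not
call XML::Atom itself; the sample string and variable names are made up
for the example. The flag would presumably be set with something like
`$XML::Atom::ForceUnicode = 1;` before parsing (the package-qualified
spelling is an assumption here, so check the XML::Atom docs for your
version).

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 bytes for "café": the "é" takes two bytes (0xC3 0xA9).
my $bytes = "caf\xC3\xA9";

# Decoding turns the byte string into a Perl character (Unicode) string.
my $chars = decode('UTF-8', $bytes);

# The same text has different lengths depending on which form you hold.
printf "bytes: length %d, utf8 flag %s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'on' : 'off';
printf "chars: length %d, utf8 flag %s\n",
    length($chars), utf8::is_utf8($chars) ? 'on' : 'off';

# Encode back to bytes at the output boundary (file, socket, etc.).
my $round_trip = encode('UTF-8', $chars);
print "round trip ok\n" if $round_trip eq $bytes;
```

Mixing the two forms in one pipeline is the usual source of
double-encoded output, so it helps to pick one form internally and
convert only at the edges.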

HTH.


-- 
Tatsuhiko Miyagawa

_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch



