[KinoSearch] How do you index ms office (.doc, .xls, .ppt) files with kinosearch
henka at cityweb.co.za
Mon Aug 25 06:42:32 PDT 2008
On Mon, August 25, 2008 1:12 pm, Ben Aurel wrote:
> My question is, what would you suggest for indexing office formats ?
> How do you extract text without ole and an office installation on
> the server?
Use file conversion utilities such as pdftotext, xlhtml, or wvHtml to
extract the text, then index the output. Most of these are far from
perfect and sometimes crash, so expect some files to fail.
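
Since the converters can crash or hang, it may help to run each one in a
subprocess with a timeout and skip any file that fails. Below is a minimal
sketch in Python, not a tested recipe. The CONVERTERS table, run_converter
and extract_text are hypothetical names made up for this example, and the
command lines assume that "pdftotext FILE -" and "xlhtml FILE" write to
stdout; check them against the versions you have installed. wvHtml is left
out of the table because, as far as I know, it expects an output file name
rather than writing to stdout.

```python
import os
import subprocess
import sys

# Hypothetical table: file extension -> converter command line.
# "{path}" is replaced with the input file name. Each command is
# ASSUMED to write its result to stdout; verify this locally.
CONVERTERS = {
    ".pdf": ["pdftotext", "{path}", "-"],
    ".xls": ["xlhtml", "{path}"],
}


def run_converter(command, path, timeout=30):
    """Run one converter; return its stdout as text, or None on failure."""
    argv = [arg.replace("{path}", path) for arg in command]
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Converter is not installed, could not start, or hung.
        return None
    if result.returncode != 0:
        # Non-zero exit status, which includes being killed by a signal.
        return None
    return result.stdout.decode("utf-8", errors="replace")


def extract_text(path):
    """Pick a converter by extension; return text, or None to skip the file."""
    ext = os.path.splitext(path)[1].lower()
    command = CONVERTERS.get(ext)
    if command is None:
        return None
    text = run_converter(command, path)
    if text is None:
        print("conversion failed, skipping: " + path, file=sys.stderr)
    return text
```

Whatever extract_text returns (HTML in the xlhtml case, so the tags would
still need stripping) can then be handed to the indexer; a None result
just means the file is skipped instead of taking the whole indexing run
down with it.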