Open Semantic Desktop Search is an “open source desktop search engine for full text search in documents”. It is built on Solr and runs on the Windows desktop inside Oracle’s free VirtualBox virtual machine software. It has been around since late 2015 and is actively developed, but its developers obviously don’t employ a publicist to promote it.

It has a clean Web-like interface and supports indexing a great many file types, including .ePUB and .PDF files, even when they’re inside .ZIP files. It can’t yet index the Kindle’s .MOBI ebook files, though, so you’d need to do an overnight mass-conversion to .ePUB or .PDF using the free Calibre software. Your purchased, encrypted Kindle files will still need to be searched using Amazon.
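For anyone facing that overnight conversion, here is a minimal sketch of how it might be scripted. It assumes Calibre is installed and that its `ebook-convert` command-line tool is on your PATH. The function names and the dry-run default are my own illustration, not anything from the Calibre or Open Semantic Search documentation, and encrypted Kindle purchases would not be expected to convert this way.

```python
import subprocess
from pathlib import Path


def plan_conversions(root, target_ext=".epub"):
    """Return (source, destination) pairs for every .mobi file under root."""
    pairs = []
    for src in sorted(Path(root).rglob("*")):
        if src.is_file() and src.suffix.lower() == ".mobi":
            pairs.append((src, src.with_suffix(target_ext)))
    return pairs


def convert_all(root, target_ext=".epub", dry_run=True):
    """Convert each .mobi file, skipping any whose target already exists.

    With dry_run=True nothing is converted; the function only reports
    which output files would be created.
    """
    done = []
    for src, dst in plan_conversions(root, target_ext):
        if dst.exists():
            continue
        if not dry_run:
            # Assumption: Calibre's ebook-convert tool is available on PATH.
            subprocess.run(["ebook-convert", str(src), str(dst)], check=True)
        done.append(dst)
    return done
```

Calling `convert_all("path/to/ebooks")` first, with the default dry run, lists what would be produced. Passing `dry_run=False` then runs the real conversion.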

Despite running in a VM (often slow on older Windows PCs), Open Semantic Desktop Search is said to work on “old standard hardware”, and “The search engine works even offline or unhosted on a single laptop without need of a intranet or internet connection or a server.”

However, online comments suggest you’ll do best with a modern PC, and those with an over-stuffed hard drive will need to clear 50GB of disk space to accommodate both the software and its resulting index. Less disk space may be needed if you’re only indexing the folder containing the .PDFs and .ePUBs for your PhD or book research.
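Before clearing space, it may help to see how large your research folder actually is and how much room the drive has free. The sketch below is only a rough check: the 50GB default is simply the figure from the online comments, the function names are made up for illustration, and the size of the source files says nothing definite about how big the resulting index will be.

```python
import shutil
from pathlib import Path

GB = 1024 ** 3


def library_size_bytes(root, extensions=(".pdf", ".epub")):
    """Total size in bytes of the files under root with the given extensions."""
    total = 0
    for f in Path(root).rglob("*"):
        if f.is_file() and f.suffix.lower() in extensions:
            total += f.stat().st_size
    return total


def has_free_space(path, needed_gb=50):
    """True if the drive holding path has at least needed_gb gigabytes free."""
    return shutil.disk_usage(path).free >= needed_gb * GB
```

For example, `library_size_bytes("path/to/research") / GB` gives the folder size in gigabytes, and `has_free_space("C:\\")` checks the Windows system drive.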

I haven’t installed and tested it yet, but it’s free and looks good. Apparently it can also automatically OCR PDFs that don’t already have OCR text, a feature added in a December 2017 update.

The search-engine software comes packaged in a 2.8GB .OVA file that you download. This .OVA is a virtual appliance that you import into the free VirtualBox software (a 110MB .EXE download), and the team’s Desktop Search page has instructions on how to import the .OVA into the installed VirtualBox. It seems fairly simple to get up and running.
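If you prefer the command line to the VirtualBox menus, the import step can probably be scripted with VirtualBox’s own `VBoxManage import` command. This is a sketch under assumptions rather than the team’s documented procedure: it assumes `VBoxManage` is on your PATH, the helper names are my own, and the .OVA file name in the usage note is a placeholder, not the real download name.

```python
import shutil
import subprocess
from pathlib import Path


def build_import_command(ova_path, vboxmanage="VBoxManage"):
    """Return the argument list that imports an appliance into VirtualBox."""
    ova = Path(ova_path)
    if ova.suffix.lower() not in (".ova", ".ovf"):
        raise ValueError(f"not an appliance file: {ova}")
    return [vboxmanage, "import", str(ova)]


def import_appliance(ova_path):
    """Run the import, failing early if VBoxManage cannot be found."""
    # Assumption: the VirtualBox install directory has been added to PATH.
    if shutil.which("VBoxManage") is None:
        raise RuntimeError("VBoxManage not found; is VirtualBox installed and on PATH?")
    subprocess.run(build_import_command(ova_path), check=True)
```

Calling `import_appliance("desktop-search.ova")` with the path to your downloaded file should leave a new virtual machine listed in VirtualBox, ready to start.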
