“PaperBot: open-source web-based search and metadata organization of scientific literature”, BMC Bioinformatics, 24th January 2019.
It seems to offer a swift and cheap way to identify Open Access full-text papers on the big publishers’ sites, even when they’re salted away in hybrid journals…
“We introduce PaperBot, a configurable, modular, open-source crawler to automatically find and efficiently index peer-reviewed publications based on periodic full-text searches across publisher web portals. [It is shown to operate across varied UIs on] a wide range of sources including Elsevier, Wiley, Springer, PubMed/PubMedCentral, Nature …”
It looks good, though so far it has only been tested on the relatively well-behaved biomedical literature in brain science. Semantic Scholar has been doing this for a while now (and the results are in JURN), but so far as I know their crawler bot is not public.