This guide was created by the curator of the JURN search-engine. Below is a short guide to various free search-engines and tools. Specifically, to those tools likely to lead to open material useful for UK students and researchers in the arts and humanities.
Last updated: 13th August 2020. Restored Hathi, now back up to speed again. Page last checked for link-rot: November 2017.
* “How can I search across academic repositories?”:
GRAFT is a beta service from JURN that searches across all known academic repositories, full-text and records alike. It is current to May 2018 and covers 4,640 repositories.
OpenDOAR is comparable to GRAFT, though GRAFT is more up-to-date and, as of November 2017, covers 1,400 more repositories than OpenDOAR does (even after a very thorough cleaning and de-duplication of URLs).
National repository search services include: EThOS (UK, theses only); DART Europe (UK and Europe); RIAN (Ireland); NARCIS (Netherlands); SWEpub (Sweden); BIBSYS and NORA (Norway); DiVA portal (the Nordic nations); Helveticat (Switzerland); Aggregator (Poland); Repozitar (Czech); Croatian Digital Theses Repository (Croatia); FULIR (Croatia, advanced research); Isidore (France); OAN (Germany); PLEIADI (Italy); Recolecta and TDX (both Spain); RCAAP (Portugal); Toubkal (Morocco); openarchives.gr (Greece); BDTD (Brazil); Alicia (Peru); TROVE (Australia); AuseSearch (Australia); NZResearch.org (New Zealand, seems to be old/broken at Dec 2015); National ETD (South Africa); Collections Canada; JAIRO (Japan); OAK (Korea); and National Digital Library (Taiwan).
* “I want to search inside millions of modern non-fiction books”:
Google Book Search. (Note that Google Books cannot be accessed anonymously via the Tor browser: Google just throws Tor users into a never-ending loop of captchas.)
Amazon ‘Search Inside’ (aka ‘Look Inside’).
* “I’m researching a historical topic, and I want to search out-of-copyright books and journals (generally pre-1926)”:
Internet Archive: Texts. Search scans of out-of-copyright books — 27,000 from Project Gutenberg, 300,000 titles from the Microsoft book-scanning project, some 300,000 English titles from the Million Book Project, and many more community uploads.
Hathi Trust. No registration is required for personal use, but whole-book downloading is not permitted for individuals.
Google Book Search will also add out-of-copyright works to the results, depending on your search phrase.
* “I already have a fair grasp of the outlines of my topic, and I want to search and freely view open access journal articles, books and theses”:
JURN has been curated for a decade, to be especially useful and comprehensive for the arts and humanities researcher.
* “I already have a fair grasp of the outlines of my topic, and I can obtain commercial academic papers and books once I know they exist”:
Journal TOCs, TOCs meaning ‘table of contents’.
* “I want to find books specifically published as open access”:
OAPEN is an excellent new resource that provides an online catalogue of free Open Access books produced in Europe.
* “I want to find out what current newspapers and magazines are saying about my topic”:
Current news reporting can sometimes be surprisingly useful, as it can include items such as book and exhibition reviews and obituaries. The best starting tools are Microsoft Bing News Search (don’t forget to re-sort results ‘by date’) and Google News Search. (Note that there are now two versions of Google News: the dumbed-down one with limited features and snippets, and the full one with full snippets and features like sort-by-date, which is the one you’ll actually want to use.) Yandex News appears to only cover Russian-language media.
Newspaper archives for the 20th century can be searched via Google News archive. More recent archives (back to 2003) can be searched via the Archives tab in the regular Google News.
* “Where can I find images online, relevant to my topic?”:
Europeana, a major European project that enables search across UK and European collections.
Google Images, with the ability to sort by date and licence.
Google Books Search sometimes gives you usable images on book covers / back covers, and occasionally even inside books. Use a screen-shot application to grab them from the screen. Microsoft Office OneNote also gives excellent OCR from small text.
You can do “reverse image search” by clicking on the camera icon in the search-box at Google Image Search. You can also use TinEye to do the same. Upload an image to find out where it came from, and if it’s really what it’s claimed to be.
* “Where can I search for public domain or Creative Commons licensed images?”:
Search @ Creative Commons has a search-engine that allows you to limit searches to items that have a creative commons licence.
Google Images has an option to search for Creative Commons images.
The Google Cultural Institute offers public-domain images from the world’s great museums and galleries.
Digital Public Library of America is the official national aggregator for public domain items placed online by American libraries. It lacks image search filters for “Sort by largest downloadable size” and “Sort by CC licence”, which would make it enormously useful, but it has some potential for finding re-usable public domain images.
Flickr’s Creative Commons directory and search options. You need to be logged in to Flickr for this to work properly. You may want to speed up the horribly bloated new Flickr interface by pretending to be Internet Explorer 8.
Kalev Leetaru has uploaded 2.6 million public domain scanned pictures to Flickr, with automatic tagging.
morgueFile — free stock photography from creatives to creatives.
Geograph UK — free-to-use StreetView-like pictures of places and roads.
There are also more vaguely licensed archives, such as Google’s LIFE magazine archive, which permits ‘personal non-commercial’ use.
Use xPert to create valid academic citations (references) for Creative Commons images.
You can also extract images from HD video. In that regard, Vimeo Creative Commons search might be useful. Use a screenshot application to grab them from the screen. The free VLC Media Player can also extract frames.
Font Squirrel — for free commercial-use fonts, that are opened, installed and tested before being posted on the site.
* “Which specific museums or art collections offer large selections of public domain or Creative Commons licensed images?”:
Wellcome makes some of its historical medical images available under Creative Commons. Be sure to switch the drop-down on their search box to ‘Historic’ only. Only their historic images are CC Attribution.
The Smithsonian has 2.8m images under CC0.
The Newberry Library has made its 1.7m images free to re-use, including for commercial purposes.
The Metropolitan Museum of Art has images online in hi-res, and freely “available for scholarly use in any media.” They also have many more which are wholly public domain.
The Finnish National Gallery has 10,000 hi-res pictures online under CC0 Creative Commons.
The Museum of New Zealand has a lot of public domain material. Their search is, however, incredibly clunky and annoying, and their website is often unresponsive, so searching for public domain material is best done via a Google Images search that combines site:https://collections.tepapa.govt.nz/ with the keyword phrase “no known”.
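For those who run this kind of site-restricted search often, the query can be assembled by a few lines of script. What follows is only a minimal sketch: the function name is hypothetical, the topic word "kiwi" is an invented example, the bare host name is used in place of the full https:// address, and the `tbm=isch` parameter for image results is an assumption about Google's URL format that may change without notice.

```python
from urllib.parse import urlencode


def site_image_search_url(site: str, phrase: str, topic: str = "") -> str:
    """Build a Google Images URL limited to one site (hypothetical helper)."""
    # The site: operator restricts results to the named host;
    # double quotes ask for the phrase to be matched exactly.
    parts = [f"site:{site}", f'"{phrase}"']
    if topic:
        parts.append(topic)
    query = " ".join(parts)
    # Assumption: tbm=isch switches Google to image results.
    return "https://www.google.com/search?" + urlencode({"q": query, "tbm": "isch"})


# Example: public domain ("no known" copyright) items at Te Papa on a sample topic.
url = site_image_search_url("collections.tepapa.govt.nz", "no known", "kiwi")
print(url)
```

Paste the printed address into a browser. The same helper should work for any other collection whose own search is awkward, provided the collection labels its public domain items with a consistent phrase.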
Other sites with substantial open online collections include: the J. Paul Getty Museum of Los Angeles and the Los Angeles County Museum of Art; the Cleveland Museum of Art and the Art Institute of Chicago; the National Gallery of Art in Washington D.C.; the Rijksmuseum in Amsterdam; the Paris Museums; and the Städtische Galerie in Munich.
* “Images are not enough for me. I want to see and perhaps even handle the actual artefacts / object collections — are there any in the UK?”:
Cornucopia. Searches the collections in UK museums, galleries, and libraries. Funding may be removed from this service in 2010/11, reportedly.
Those who might be satisfied with digital ‘virtual’ 3D objects might also look in the Google 3D Warehouse.
Europeana, a major European project that enables search across UK and European collections, may also be useful in locating collections that can be visited.
* “Are there any maintained full-text search-engines for specific arts and humanities disciplines?”:
In the years I’ve been building JURN, I’ve only found these few: Sisyphos searches in Egyptology / Ancient Near Eastern Studies; Theological Journals Search and Biblical Studies journals cover Christian scholarship; and the Alcuin Society’s Search indexes around 150 sources on fine printing and the book arts.
* “I’ve heard about these free ‘open’ courses that some major universities are starting to put online?”:
MOOC List is about the only comprehensive search tool, at the end of 2017. Unfortunately it’s a real slog to navigate, and/or force into a “what new ‘live’ courses are starting next month?” format.
* “Is there a simple search-engine for all audio-books from all major vendors?”:
Librophile is the only one that JURN’s curator knows of.
* “Are there search engines for finding Creative Commons music and audio?”:
Freesound is the best site for CC audio clips.
* Is there any public alternative to Google Search, for serious online research?:
No, not really. On the desktop, Google Search is still the must-use general search-engine for scoping research as of 2020, despite the infuriating ‘captchas’ it will throw at serious in-depth researchers.
Bing and all engines providing versions of it (e.g. DuckDuckGo, Yahoo, Ecosia, Qwant etc) can be used for casual everyday ‘navigational’ searches, but Bing et al are quite useless for serious research purposes. Bing simply doesn’t go deep enough or wide enough.
Some suggest Startpage.com as a Google alternative, but after testing for some months you’ll realise you’re only getting a ‘cut-down Google’ version.
The only somewhat-useful fallback general search, with a Google-like scope, is the Russian search-engine Yandex which runs its own index.
There are, however, two services that can be especially useful in specific ways. Yippy is basically a re-skinned Bing, but it usefully pushes answers in trusted technical forums ‘to the top’, and can do especially well for discovering technical advice, such as how best to formulate a bit of code or craft a regex. This is also said to be the case in other domains of expertise. The DuckDuckGo Image Search component (seemingly a blend of Bing and Yandex) is also quite useful for picture research, mainly because it tends to be less spammy and off-topic than Google Images, though it of course lacks various useful date/licence filters found in Google Images.
* I think what I want may only have been published in very small numbers, perhaps even distributed informally?:
Open Grey: System for Information on Grey Literature in Europe. Bibliographical references of reports and other “grey literature” produced in Europe until 2005.
Modernist Journals project. Large searchable library of art and intellectual journals in and around modernism. Mostly early 20th century.
Do you know the file-name of an old report? Then it might also be searched for via specialised file-search engines, such as Filemare as well as Google.
* I need to identify specialist blogs on my topic. Is there a good blog search tool, that runs across all blogs, and does so in a timely and comprehensive manner?:
No, and that’s a huge lack. WordPress.com’s internal search is useless, as it is not comprehensive. Bing appears to shun most blogs, and indexes high-quality blogs only weakly even when it notices them. Bing is also a pointless search engine for other reasons (though Bing News is good at tracking breaking news). DuckDuckGo is slightly better, but still very capricious about what it will index in the blogosphere. There are no longer any worthwhile dedicated ‘blog search’ tools, such as the old Technorati. Only Google Search seems to have the ability to distinguish high-quality content blogs from spam and marketeering blogs, and to visit them with any kind of frequency. Even then, the choice of what gets served in your search results from a given blog seems capricious. For what it’s worth, I have a guide with tips on some of the best ways to approximate blog search and discovery at the end of 2018.
* Are there search tools for finding open Big Data sets and statistics sets?:
Databib is a searchable directory of research data repositories.
ZanRan is a search-engine for finding Big Data & statistics.
OpenContext is a hub for archaeology data sets.
Dataverse is a ‘virtual hub’ where you can create and store a data-only repository for your university.
* I need some old and rather obscure software, dataset or assets package for my research. I have the file name, but Google is no use there. Is there a good deep file-search engine?:
Yandex. Unlike the other search-engines, it doesn’t censor results. But searchers should be sure to exercise due caution when handling .exe files from unknown places.
* Is there a free tool which lets me search through the recent and current tables-of-contents of commercial journals?:
The UK service Journal TOCs lets you search by keyword through the tables-of-contents of “20,658 journals collected from 1,373 [mostly commercial] publishers”. Seems to work best with a single keyword.
* Is there a free way of accessing and searching my nation’s newspaper archives?:
In the UK, simply by joining your local free public lending library, you should get free home access to ProQuest UK Newsstand. UK Newsstand allows you to search by keyword across the archives of national and regional newspapers, and retrieve full-text. Most archives seem to go back to around 1998.
* “Are there free art history bibliographies I can use online, without having to register?”:
Arcade (search the holdings of New York’s major art libraries).
ARTicles Online (search the journal article catalogues of the European Art Libraries Network, inc. Florence, Munich, and Rome).
Also try the Art Discovery Group Catalogue.
Possibly also useful is JISC Library Hub Discover, an aggregated search across 144 UK and Irish academic, national & specialist library catalogues.
(See also: The Getty Art & Architecture Thesaurus online)
* “Is there a good software tool to help me to store and organise my research PDFs, as I find them?”:
* “I’ve downloaded loads of PDFs to my desktop PC for my research. Now is there a way to keyword-search inside them all?”:
* “Is there a big list of all the open English language ejournals in the arts and humanities, one that links to the home pages?”:
Yes, the JURN Directory. This is link-checked and repaired regularly…
Similar directories are the DOAJ and the EZB. Although dated, Jan Szczepanski’s List is also very useful, especially for finding titles published in languages other than English and which are unlikely to be on the DOAJ.
* “Is there an up-to-date directory of highbrow literary “small magazines” in the English-speaking world?”:
* “I’m an overseas student who can read academic articles in my own language — how can I access open access journal articles written in my language?”:
Brazilian: several services appear to have recently (Nov 2017) gone offline. For now, try Redalyc for coverage of Brazil and the wider Central/Latin American region.
Japanese: CiNii (includes the “NII Repository of Electronic Journals and Online Publications”) and JAIRO for searching repositories for journals. Also look at the full-text ejournal archive at Journal @rchive (aka J-STAGE, run by the Japan Science and Technology Agency).
Turkish: DergiPark. Turkey appears to have a 100% open access policy for its journals.
Finland: the Journal.fi portal.
Serbian: SCIndeks is an aggregation site for 189 Serbian journals.
Mongolian: Mongolia Journals Online.
Hong Kong: Hong Kong Journals.
China: China Open Access Journals (may be unresponsive).
Welsh: Those who can read Welsh should consult Welsh Journals Online.
Another alternative global Open Access journal directory is, of course, the DOAJ. This directory usefully tries to bar or weed out questionable journal titles, and it covers all the world’s languages. Keep in mind that DOAJ only indexes current titles that are engaged in ongoing publication, excluding a journal once it ceases publication — even if the journal archives remain online.