What Google Search has to filter out every day… demonstrated by a sample search for Lovecraft (as in the author “H.P. Lovecraft”) for just the last 24 hours. About ten valid hits, which Google has usefully brought to the top of the results — but also thousands of malware-laced fake ebook PDFs and spam pages offering ‘essay writing’ services (of the sort that claim to help students cheat their way through university).
Amazon’s search is becoming more and more useless. Search for “middle-earth” in Books: Reference, and half way through the second page of results, it starts slipping in multiple “middle-east” results. Presumably on the assumption that the searcher is a drooling idiot who’s mis-typed the search query.
80% of JURN’s entire URL list has now been checked for the continuing presence of the URL path in Google Search. I check the specific URL path being indexed, and not just the basic domain (e.g. for ITJ: The Intel Technology Journal, http://www.intel.com/content/www/us/en/research/ rather than http://www.intel.com). Broken URLs are being re-found and fixed, or deleted, as required.
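As a rough illustration of the difference between checking a path and checking a domain, here is a minimal Python sketch that turns an indexed URL into a `site:` query that could be pasted into Google Search. The function name `site_query` and its `path_level` parameter are made up for this example, and it only builds the query string; it does not run the search or reflect how JURN's checks are actually carried out.

```python
from urllib.parse import urlsplit


def site_query(url: str, path_level: bool = True) -> str:
    """Build a Google `site:` query for a URL, at path or domain level.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if not path_level:
        # Domain-only check: passes even if the journal's section has vanished.
        return f"site:{parts.netloc}"
    # Path-level check: covers only the specific section being indexed.
    return f"site:{parts.netloc}{parts.path}"


itj = "http://www.intel.com/content/www/us/en/research/"
print(site_query(itj))                    # path-level query
print(site_query(itj, path_level=False))  # domain-level query
```

The path-level query is the stricter of the two, since a live domain says nothing about whether the journal's own pages are still there.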
The JURN Directory has been link-checked and updated.
Half of JURN’s entire URL base has now been checked, looking for the continuing presence of the URLs in Google Search results. A broken URL path is usually re-found and fixed, but is sometimes deleted where the site is “404” or where the archives have vanished (e.g. the Royal Navy’s Naval Review journal).
African universities often have better access to journal databases than western counterparts, thanks to big aid deals for the continent, but I wondered if Pakistan has a similar full-range access. I had a quick initial look at the journal-access situation in Pakistan, and soon found the national HEC Digital Library and its list of included databases and publishers…
“HEC National Digital Library (DL) is a[n official national] programme to provide researchers within public and private universities in Pakistan and non-profit research and development organizations with access to international scholarly literature based on electronic (online) delivery, providing access to high quality, peer-reviewed journals, databases, articles and e-Books across a wide range of disciplines.”
The supplied databases look like a wide selection and are available to bona fide institutions in Pakistan, though it looks like there’s a certain subset of databases reserved for larger institutions only.
Access there looks broadly comparable to that of a medium-sized university in the west, if the 2016 paper “The impact of non-accessible library and information science journals on research productivity in Pakistan” is anything to go by. It found, from Pakistan…
“18% non-accessible and 37% partially accessible LIS journals on the HEC subscribed databases.”
Though I note that, since then, Pakistan’s HEC Digital Library has added Gale, Oxford, Proquest, and probably others, which has likely narrowed the gap.
A 2009 grassroots report found that the main barrier to access was said to be frequent power-cuts, rather than the databases…
“the respondents emphasized that electricity failure is the main hindrance to access to the digital library and to the Internet”
It’s time for another in my series of group tests. The aim here is to test public tools used for keyword searching across open access (or otherwise free) academic papers, theses and/or books. The last such test was in December 2015, so it’s been over two years.
I decided to re-visit the humanities, combining history and literature with the search: Shakespeare “sonnet 71” sources. It’s not a sophisticated search, but rather the sort that a somewhat uncertain undergraduate might initiate.
‘Shakespeare’ is a big enough topic to give a range of results and test relevance ranking, but also tests if the engine can distinguish between Shakespeare’s Sonnet 71 and Sir Philip Sidney’s Sonnet 71 (ideally the search tool knows: ‘if searching for Shakespeare, user does not want results for Sidney’). Adding “sonnet 71” to the search query is specific enough to tell how many spurious links are being bundled with results (i.e.: ‘includes mention of a sonnet, but ermm… not 71’). Adding ‘sources’ as a kicker is intended to give semantics modules a slight challenge, because it might be that the searcher is looking for: i) the church and legal influences on the sonnet; or ii) is seeking articles on some of the many later creatives who have used the famous “sonnet 71” as their source. So it’s a tricksy search, even though it might appear straightforward.
I’ve included some new search tools in this test, new since the last group test in December 2015. The additions are the Chinese National Science Library’s GoOA; Dimensions; dissemin; the newly public 1findr (formerly oaFindr); and ScienceOpen Search.
EconBiz was also included for completeness, even though it’s a business studies and economics search tool. It seems that Timothy McCallum’s nGram-based openaccess.xyz has been retired, and he’s now a blockchain consultant. The ACI Scholarly Blog Index has also been retired.
Web browser script-blockers and similar were turned off for the tests.
|JURN group test: Shakespeare “sonnet 71” sources
July 2018. Searching for free full-text academic articles, book chapters, dissertations/theses or other substantial content in English. I clicked through on possible results and evaluated.
|OpenDOAR||–||Appears to no longer offer full-text search across repositories? A version 2.0 is now online, and I guess we may see the full-text search capability return in future? In the meantime, your alternative is GRAFT.|
|EconBiz||0||Zero from zero results. All fields, searched for ‘Open Access Only’. To be fair, I should note that EconBiz is only meant for business and economy search.|
|GoOA||0||Used the Chinese-language interface, toggled it to a ‘keyword’ search. Zero from among a slew of hard science results.|
|Q-Sensei||0||0 from zero results. Got the message “Sorry an error has occured”. Tried several other browsers, with the same result. Defunct?|
|WorldWideScience||0||No results. Possibly due to a cause indicated by the message: “Adobe Flash Player is missing”. I wasn’t willing to compromise my security by installing Flash, so this source went untested. Why would anyone need a legacy plugin like Flash just to serve search results?|
|JournalSeek||0||Zero results, from four results. To be fair I should mention that JournalSeek is focussed on science and is meant to find journals themselves, rather than their articles.|
|DOAJ||0||Used ‘Article’ search. 0 from zero results. A simpler Shakespeare sonnet sources also had no results. I had to cut right back to just Shakespeare sonnet to get even 14 results, which were on-topic but not relevant to the original search for Sonnet 71.|
|dissemin||0||0 from zero results.|
|JournalTOCS||0||Search by keyword. Zero from zero results.|
|Paperity||0||Sorted by Relevance (default). Checked first 30 results. 12 appeared to be generally relevant to Shakespeare. Four short and straightforward stage-play reviews, and one spurious result, were discounted. Of the remaining, three were about using Shakespeare to teach maths, another was a tangential book review on posthumanism. Of the rest, the PDFs were downloaded, and a search for ‘sonnet’ in each PDF gave zero results.|
|British Library EThOS||0||0 from zero results.|
|OpenAIRE||0||0 from zero results.|
|Microsoft Academic||0||0 from two results. One result was on Sir Philip Sidney (who also wrote a Sonnet 71), the other on John Benson. Both were on Project Muse, and both were paywalled with a “Purchase from JHU Press” message.|
|Ingenta Connect||0||Zero from zero results. Searched by ‘Article title, keywords or abstract’ + ‘OA only’, and also tried ‘All free’. ‘All’ also produced zero results.|
|Google News||0||Google News can be surprisingly useful for triangulating contemporary aspects of one’s search topic, by surfacing notices of exhibitions, conferences, projects, obituaries and publicity for new books. But for this search it could only inform me that a range of the key manuscripts of British literature had been on show in China in the spring of 2018, along with relics and books used by the first Chinese translators of Shakespeare.|
|ScienceOpen Search||0||Zero relevant results. Removed the “in last week” filter, and switched to ‘Relevance’ ranking of results. Checked first 30 results. 13 were in some way relevant to Shakespeare, though with a distinct slant toward medical topics. Top results included: a comedic faux-managerial ‘assessment’ of The Globe Theatre; a note from the Indian Medical Gazette of 1877; a half-page review on the Ulster Medical Journal of 1959; a medical student’s mock-Shakespearean comedy skit from 1989. A later result, “Searching for Shakespeare in the Stars” examined the science knowledge evidence in Shakespeare for clues to the ‘real’ authorship, but did not discuss Sonnet 71. Other results were short book reviews in medical journals. A late result was “Automatic Compositor Attribution in the First Folio of Shakespeare” on an automated bibliographic detection process. “Did William Shakespeare and Thomas Kyd Write Edward III?” took a digital humanities approach to Edward III. “Language Individuation and Marker Words: Shakespeare and His Maxwell’s Demon” was another digital humanities project, sweeping across words plucked from “168 plays from the Shakespearean era”.|
|Mendeley||0||Searched ‘Papers’ only. Zero results. Tried again with ‘sonnet’ rather than ‘sonnet 71’. Then had two results, one on Milton, and the other rather interesting: “Shakespeare, plants, and chemical analysis of early 17th century clay ‘tobacco’ pipes from Europe”. Apparently cannabis and coca leaves could be detected! Though full-text and under full CC, this sadly proved to be a one-page ‘Correspondence’ letter and had no mention of the Sonnets.|
|CORE||0||Zero from 30 results. Filtered by “full-text only”. Looked at the first 30 results. CORE’s semantics module is obviously still rather primitive, but 23 results were generally relevant to Shakespeare in some way. One possible item, “Searching for Shakespeare in the Stars”, had already been found and discounted. A half dozen classroom items from Education Studies were discounted e.g. “Shakespeare for all ages and stages”, “Introductory guidance for teachers”, etc. Six likely PDFs were downloaded. None were relevant, though the article “Small Latine and Lesse Greeke? Shakespeare and the Classical Tradition” was of some background interest.|
|Digital Commons Network (BePress)||0||3 results for the first pass, none seeming to be useful. I tried again with simply ‘sonnet’ and then filtered results for ‘English Language and Literature’. Looked at first 30 results, and investigated the undergraduate dissertation “The Bard and The Word: the influence of the Bible on the writings of William Shakespeare” and the paper “Missing Shakespeare’s Law: Some Writing about Some Reading about Close Reading”. Neither proved directly useful, though both hinted at Church language and legal phrasing as possible influences on some of the Sonnets.|
|OAlib||0||Zero from 30 checked results. First three results looked good, actually being about Shakespeare’s sonnets (though not Sonnet 71). Results came in tens and by the second and third page wildly off-topic biomedical results were creeping in, but the other results were broadly on-topic for Shakespeare. 12 likely results were further investigated. But… the top result went to ojs.academypublisher.com, a domain now ‘404’ and for sale. The second and third results went to “Internal Server Error”. So much for the promising first three results. The majority of the other papers were about later receptions and re-workings of Shakespeare plays rather than poetry. A short 2003 article “A Few Hints To Approach Shakespeare’s Works” in Literatura y Linguistica No. 14 would have provided a very useful short grounding for a sixth-former or undergraduate new to Shakespeare, but was not specifically relevant to the search.|
|BASE||0||Zero results, so I tried again with just ‘Sonnet’ rather than ‘Sonnet 71’. I then filtered results by ‘Open Access’ and checked the first 30 results. Relevance seemed to be excellent at first glance, though translation featured strongly in later results and by the third page the relevance had trickled away. Six likely candidates were chosen for testing. The first was blocked with “Embargoed Content”. A possibly interesting discussion of Shakespeare’s sonnets 78–86 was blocked by a paywall at Oxford Academic. “L’art de Shakespeare dans les Sonnets” proved to be in French. The DOIs at 10.1093/nq/17.4.132-b and 10.1093/nq/192.25.548-g were both ‘404’ dead. The final link claimed to be a PDF archive of a page of Web links on ‘Social Issues for High School students’ in relation to Shakespeare, but failed to be what it claimed.|
|OATD||0||Zero results. Tried again with just ‘sonnets’ and had three results, only one of which was relevant, the undergraduate B.A. dissertation “Metaphors of Time : Mortality and Transience in Shakespeare’s Sonnets” which looked specifically at Sonnets 60, 64 and 65. But this dissertation seemed very short (abridged?), and though it tried to present some background and concepts it could not be called a valid ‘hit’.|
|SciLit||0||Zero results, so I tried again with just ‘sonnet’. Filtered by ‘Open Access’, giving three results, none relevant.|
|NDLtd||0||Zero hits. Checked first 30 results. Some off-topic science articles (by science researchers with the surname Shakespeare), but otherwise broadly relevant to Shakespeare and his sources. Five likely items were examined. The book Le Vers de Shakespeare looked possible, if in French, but was anyway found not to be available in full-text. Shakespeare’s books: a dissertation on Shakespeare’s reading and the immediate sources of his works (1904) had a link which went to Archive.org. After making a search of this book, the searcher would have had no luck, but would have been made aware of the need to also search for Roman numerals (e.g. ‘Sonnet LXXI’) in pre-1930s material on Shakespeare.|
|1findr (formerly oaFindr)||1||Zero results for the original search. When just ‘sonnet’ was used, there were 13 search results in Open Access. Relevance was good, with “‘Subject to Invent’: Adaptations of Shakespeare’s Sonnets into other Media” being top, and this offered a stimulating overview that would be useful enough (to a certain type of artist-creative searcher, looking for past examples of adaptations) to be counted as a ‘hit’. The article “A Previously Unreported Source for Shakespeare’s Sonnet 56” proved to be paywalled.|
|Dimensions||1||1 from three results. Filtered for ‘Open Access’ only. One article, from the International Journal of Applied Linguistics and English Literature, had a passing remark on Sonnet 71.|
|SHARE||1||Sorted by Relevance | Publication. Top five results were to the text of Sonnet 71 itself, but, on clicking through: “Users without a subscription are not able to see the full content”. Otherwise relevancy was strong. Looked at first 30 results. There was some free material, a couple of which claimed to be on JSTOR Free (if you sign up). Of these, a review of “Shakespeare’s Books: A Dictionary of Shakespeare Sources” and a review of “The Foreign Sources of Shakespeare’s Works. An Annotated Bibliography of the Commentary written on this Subject between 1904 and 1940” might have provided useful references. The 1957 book Narrative and Dramatic Sources of Shakespeare was declared un-purchasable, but had a preview PDF with a possibly useful bibliography and index. I lumped these three marginal results together and, feeling generous, counted them as a combined “1” score. The DOI link to the book “A Shakespeare Reader: Sources and Criticism” was ‘404 not found’ at Springer. “Breath, Today: Celan’s Translation of Shakespeare’s Sonnet 71” looked very interesting, but was firmly paywalled. “Scoping Shakespeare, costume and performance; approaches, sources and interpretations” was a nice free backgrounder article, but not relevant. Overall, SHARE seems worth a look when doing a search. The relevancy was good, and there was some genuinely free content (for those prepared to dig past broken DOIs and suchlike).|
|PQDT Open||1||Sorted by relevancy. The first 30 results were all broadly on-topic for Shakespeare. Top results had a focus on the comedies, youth culture and contemporary staging, all variously reflecting the fact that PQDT is strong on searches of undergraduate dissertations. “Couplets and sententia in Shakespeare’s sonnets” had some discussion of Sonnet 71, just enough to count as a hit.|
|FreeFullPDF||5||The original search query gave no results, and so Shakespeare “sonnet 71” proved the only feasible search here, giving 49 results. FreeFullPDF is known to include the likes of academypublication.com, and users need to be wary. Though in this case the said ‘publisher’ did manage to offer the reasonably on-topic and highly polished “The Tragic Vision in the Fair Youth Group in Shakespeare’s Sonnets”, which ranked highly. “First and Final Things: Shakespeare’s Sonnet 145, and his Epitaph” discusses Sonnet 71 in the context of the poet’s epitaph and Stratford-upon-Avon and could have nudged a student toward a useful bit of topographical grounding for their essay. “Shakespeare’s Sonnets in Russian” discusses a prose translation of 71. Confusion with Sir Philip Sidney’s Sonnet 71 added a couple of spurious results. “The cognitive poetics of literary resonance” appeared again, and having been previously seen I knew it had a useful commentary on 71. “Echoes of Desire” was a Cornell University book liberated by the National Endowment for the Humanities, and proved to offer some slight comment on “Shakespeare’s Sonnet 71” (page 128) amid a general overview of Petrarch and “Petrarchan elements in the sonnets”. The article “Autor des Sonnets” appeared at first to be French but proved to be English and to have some perceptive and quotable comments on Sonnet 71: “You can hear the bell all along these lines [in 71]. A sonnet is a bell ringing. Once the bell is in motion you cannot make it stop. Slow it down only. It has to exhaust itself in time.” The open book Love and its Critics: From the Song of Songs to Shakespeare and Milton’s Eden has mention of a Sonnet 71 — but it proved to be Sir Philip Sidney’s Sonnet 71.|
|Google Scholar||7||I removed citations and patents from search results. There is no way to filter Scholar for ‘Open Access Only’, so the first 100 results were examined. This gave Scholar a good chance to pop out some OA links. “The cognitive poetics of literary resonance” was available in full-text from Academia.edu and had three pages of detailed discussion of Sonnet 71. “‘71’ by William Shakespeare” was also in full-text, though proved to be a very short and under-researched undergraduate term-paper. “Teaching Shakespeare’s Sonnets: Time as Fracture in Sonnets 18, 60, 73” was full-text, on-topic in theme and had some passing discussion of 71. “A Light in the Darkness: Imagery of Light in Shakespeare’s Sonnets” was a good Japanese paper, but had only the briefest mention of 71. “Couplets and sententia in Shakespeare’s sonnets” reappeared (it had been seen before), and was counted as a marginal ‘hit’. A couple of later papers were on translations, which also revealed that the Turkish version has been successfully set to music. “Music In Shakespeare” has some mention of Sonnet 71 in the context of Arthurian tales of courtly love, and presents the idea that this sonnet might have originally been imagined as being sung to music. (However, that article was on the Shakespeare Oxford Fellowship website, which champions the fringe idea that Shakespeare was not the author of his works. As such, it isn’t indexed by JURN). Then way down on page 10 of the results there was “Shuffling off this mortal coil. A Shakespearean perspective on death and dying” at ncbi.nlm.nih.gov. Given the subject matter of Sonnet 71 this would be very useful background context for an undergraduate. All in all, seven useful ‘hits’.|
|JURN||10||Looked at first 30 results, sorted for ‘Relevance’. The top result “Shakespeare and Philosophical Criticism” had a long discussion of Sonnet 71. The next two were on French and Russian translations, each with a mention of 71. “In Sleep a King” (chapter 5 of a monograph) has many pages of high-quality discussion on 71, including sources which might offer “a vein of medieval Christian teaching and belief in this sonnet” and a discussion of possible influence from legal terminology on the sonnet. “Teaching Shakespeare’s Sonnets: Time as Fracture in Sonnets 18, 60, 73” reappeared, having been seen previously. There was a duplicate of the top result, “Shakespeare and Philosophical Criticism”, in the form of the entire book which held it. Various other ‘hits’, already seen and checked, appeared.|
While not offering the prettiest-looking set of search results, JURN offered the highest quality and most sustained discussions of Sonnet 71, and these ranked highly. The main problems with this set of results in JURN were:
1) usefulness (or not) was sometimes obscured behind vague link-titles (e.g. Springer’s free-sample chapters from books tend to have the plain and un-alluring link-title of “Copyrighted material”). Snippets of the full-text do give useful additional hints on these, but the snippets can mislead if part of an index.
2) despite Google’s semantic and name-authority prowess, everyone’s favourite search behemoth confused Shakespeare’s Sonnet 71 and Sidney’s Sonnet 71 rather too many times for my liking. This could, of course, be solved with a -“Sidney” search command and other adjustments of the search terms.
3) I would have liked to see the article “Shuffling off this mortal coil. A Shakespearean perspective on death and dying” in the top 30. It’s indexed in JURN but it’s not surfaced by this particular search. Even Google Scholar, however, shoved it down to page 10 of their results. Perhaps there’s something in the algorithms that says ‘ncbi.nlm.nih.gov = science and biomedical, do not mix with humanities results’. “First and Final Things: Shakespeare’s Sonnet 145, and his Epitaph” would have also been nice to see, for its discussion of 71, but I’m guessing that the obscure Italian siba-ese.unisalento.it server is poorly ranked.
JURN could have had a couple more reasonably on-topic results, if I had indexed: i) a questionable publisher; and ii) a fringe website which champions the idea that Shakespeare was not the author of his works. But that would be to dilute the results with the sort of material that an undergraduate would not be savvy enough to be wary of.
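Two of the adjustments mentioned in this test, excluding Sidney with a minus term and also trying a Roman-numeral form of the sonnet number for older material, are easy to sketch in code. The following Python is only an illustration: the function names `to_roman` and `sonnet_queries` are made up here, and the output is just a list of query strings to paste into a search box, not anything JURN itself does.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer below 4000 to a Roman numeral."""
    if not 0 < n < 4000:
        raise ValueError("n must be between 1 and 3999")
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


def sonnet_queries(author: str, number: int, exclude=()) -> list:
    """Build Arabic and Roman numeral query variants, with optional exclusions.

    Hypothetical helper: `exclude` names become minus terms such as -"Sidney".
    """
    minus = " ".join(f'-"{name}"' for name in exclude)
    variants = [str(number), to_roman(number)]
    return [f'{author} "sonnet {v}" {minus}'.strip() for v in variants]


for query in sonnet_queries("Shakespeare", 71, exclude=["Sidney"]):
    print(query)
```

Whether a given search tool honours quoted phrases and minus terms varies, so each variant would still need trying by hand.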
The Directory of 3,000 arts & humanities journals in JURN can now be had on this blog as a saved PDF, in its latest version (currently 20th May 2018). Those who disliked the scripted “bouncy-puppy” effect will be pleased to know the Directory’s sections are now fully expanded.
It’s been saved from the HTML as a PDF for A2 paper in landscape format, to accommodate the HTML wide-screen layout, although I doubt anyone will want to print it out at that size. Future updates will be versioned, with number and date added to the file-name.
Note that Microsoft Reader can’t seem to handle external Web links in PDFs. Adobe Acrobat, SumatraPDF etc work fine and offer clickable Web links. Some side-scrolling may be needed, if you have the likes of SumatraPDF hard-coded for magazine reading with ‘cover page + double-page spreads’.
New to me: Google Translate now works on foreign-language PDFs. Perhaps it’s been available for a while, but I’ve seen no-one blogging about it.
It doesn’t work if you just right-click on the Web link to the PDF in, say, Google Scholar or JURN search results, and then select “Translate this page…”.
Instead you have to:
1) Right-click, and copy to the clipboard the direct PDF link.
2) Visit Google Translate, manually paste in the URL you just copied.
3) Click on the URL that appears over in the facing box.
4) The PDF text appears extracted, in the form of a Web page, and translated.
Very useful, and I had excellent results with a Polish article I tested. I had the whole article translated, too, not just the first few paragraphs. Longer items such as a PhD thesis will be refused as “too long”.
Note that a ‘redirect URL’, which gives the PDF but hides the direct URL link to the PDF, is of no use in the above workflow.
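For anyone wanting to skip the manual pasting, the steps above can be roughly sketched in Python. This is a guess at a shortcut rather than a documented route. It assumes that Google Translate accepts a page address through a `translate.google.com/translate?sl=auto&tl=en&u=…` style URL, which may change or stop working, and it uses a crude ‘ends in .pdf’ test to reject redirect links. The names `is_direct_pdf_link` and `translate_url` are made up for the example.

```python
from urllib.parse import urlencode, urlsplit

# Assumption: this endpoint and its sl/tl/u parameters are not confirmed
# by the workflow described above, which uses manual pasting.
TRANSLATE_ENDPOINT = "https://translate.google.com/translate"


def is_direct_pdf_link(url: str) -> bool:
    """Crude heuristic: treat a URL as a direct PDF link if its path ends in .pdf."""
    return urlsplit(url).path.lower().endswith(".pdf")


def translate_url(pdf_url: str, target: str = "en") -> str:
    """Build a Google Translate address for a direct PDF link (hypothetical helper)."""
    if not is_direct_pdf_link(pdf_url):
        raise ValueError("Not a direct PDF link; redirect URLs are of no use here")
    query = urlencode({"sl": "auto", "tl": target, "u": pdf_url})
    return f"{TRANSLATE_ENDPOINT}?{query}"


# example.org is a placeholder address, not a real article.
print(translate_url("https://example.org/articles/paper.pdf"))
```

The ‘.pdf’ test will wrongly reject some direct links that lack the extension, so it is best treated as a first filter only.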
Sadly I guess it’s also a route to plagiarism for students. I’d suggest that the anti-plagiarism detector-bot services might usefully build a bank of Google-translated theses and dissertations, to add to their phrase-detection sources. Teachers who mark suspiciously-excellent final dissertations, and who are then inclined ‘to go on the hunt’, should also be aware of the possibility that the lacklustre student may have run a foreign dissertation through Google Translate and then lightly re-written it for clarity in English.
GRAFT has just had another tranche of new URLs added to its index. Now searching across 4,640 university repositories, full-text and records alike.