Google has announced a new anxiety stress-test. No, it’s not one of their infernal ‘captchas’ on Google Search, guaranteed to send stress levels through the roof. This is a post-lockdown “seven-question survey” about one’s personal anxiety, developed with the National Alliance on Mental Illness. The Googlebots will keep the results private and locked down forever, or so they promise. But I’d be a little worried about that.
Verizon OneSearch is a new Google competitor, which appears to have been launched in February. It was initially reported as being Bing-powered.
I had strange results. When I first used it, it showed stuff I know Bing doesn’t index. I know what Bing (and by extension, DuckDuckGo) looks like for certain searches, and the OneSearch results were nothing like what I saw. I saw the exact results I’d expect from Google Search.
But on a second try with the same search 30 minutes later, the results reverted to looking exactly like the mediocre Bing / DuckDuckGo ones. Very curious.
Anyway, also of note is its Image search, with Google Images-like filters for size and CC licence…
The big drawback to everyday use is its curious 30 seconds of complete unresponsiveness, which seems to kick in every few minutes. When it’s actually responding, it’s fast.
The Brazilian SciELO is updating its inclusion criteria. There are of course half a dozen other SciELO aggregators around South and Central America, but Brazil’s is the biggest.
According to the English-language summary, to stay in after 2020 the indexed journal must…
* accept for consideration articles “that are posted in a preprint server”
* be “citing and referencing all data, software codes and other content underlying the article’s texts”
* have in place “options for opening peer review”
No. 1 sounds good, and might be usefully extended to blog posts that included part of what later became an article.
No. 2 may be a bit problematic for those who rely on big closed computer-models, but I guess that simply “citing” that the model (presumably) exists may be deemed good enough. But it would be nice if SciELO required that the link should always lead to the full public data or model.
No. 3 appears to leave “opening” curiously unspecified. What options are acceptable, and by what criteria will a journal be judged to have engaged in “opening” its peer review? And how will this impact perfectly valid small single-editor journals in the arts and humanities? In which, for instance, the editor is the world-expert on the niche topic and single-handedly does a ‘light-touch’ peer-review on the year’s articles? Will they be forced to take on a new Peer Review Board, and then run and chase it, or else leave the Brazilian SciELO?
AnswerThePublic.com is an interesting new search tool. Instead of searching for answers, it tries to pick up the questions being asked about products. It promises “a direct line to your customers’ thoughts”, or at least those customer-users who are savvy enough and non-expert enough to pose a well-formed question in the right place.
The searcher seems to be limited to three searches per day, after which your time is up and you’re shown this slice of cheese…
The results format is quite elegantly graphical and useful, though I can’t screenshot and discuss these here because… I’ve had my three searches. There was some kind of linkback to Google Search on some of the results tabs, which seemed to make it even more useful.
Google Research has launched COVID-19 Research Explorer. This has “a semantic search interface” that enables better search and discovery across “more than 50,000 journal articles and preprints”.
“SaveDotOrg campaign succeeds, as ICANN rejects .ORG sale”. Excellent news for .org site owners. A sale could have led to a scenario in which a new owner would have been able to loot and pillage .org, by drastically hiking up everyone’s registration fees. That’s off the cards for the moment, but The Internet Society still wants to find a new “faithful owner” in due course. One who can offer “protection against censorship and financial exploitation” for .org sites.
The UK government has just announced that “Plans to scrap VAT on e-publications have been fast-tracked, and will come into force tomorrow”. VAT is the UK’s main sales tax. This should mean cheaper lockdown e-reading and research — so long as publishers and Amazon don’t just keep prices the same and pocket the 20% as extra profit. The move covers “e-books, e-newspapers, e-magazines and academic e-journals”, but seemingly not audiobooks. The change will be permanent, and had been scheduled for December 2020.
The UK government will also spend £35 million in taking out ‘public education’ print ads in newspapers, over the next three months. This will be “split between local, regional and national print media”, with what appears to be a strong tilt toward what the government calls the “most-trusted” print newspapers. This may imply that the shoddy, slipshod and alarmist reporting we’ve seen could be about to have financial consequences for newspapers.
Martin Paul Eve has a new post on Zotero and auto-downloading open access books…
all I really wanted was to be able to embed an ISBN and a citation_pdf_url and have Zotero do the lookup and save the file. However, out of the box there is no easy way to do this.
His test book is quite interesting. It is his own new Close Reading with Computers: Textual Scholarship, Computational Formalism, and David Mitchell’s Cloud Atlas (April 2020), which applies textual computing to the science-fiction-philosophy novel Cloud Atlas.
I don’t know about or use the current version of Zotero, so I’m unsure what advantages it confers. I assume Eve intended to find a way to automatically harvest all CC BY-SA books in PDF, and build a local collection for automated analysis.
But I see his book is already on the OA book aggregator catalogue OAPEN. Theoretically then, since OAPEN is comprehensive and timely, one could have a harvester look at all the pages hanging off library.oapen.org/handle/ and save out only those pages with the required permissive CC “Rights” label on them. These pages each have a uniform PDF link URL in their HTML, in the form of library.oapen.org/bitstream/ and these could be easily extracted to a list. One would end up with a set of PDF links for a linkbot, ready to download to a local folder for computational analysis. I presume that’s what Eve intended to have Zotero do.
One would need to reference the OAPEN record page first, in the way I’ve suggested, since the PDF itself can have different or non-uniform or contradictory licence information. For instance in its interior Eve’s book is labelled as both “©” … “No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or in any information storage or retrieval system without the prior written permission of Stanford University Press.” and also “Creative Commons Attribution-ShareAlike 4.0”.
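The record-page step described above can be sketched roughly in Python. This is only an illustration under assumptions I have not checked against the live site: that the licence shows up on the record page as a creativecommons.org URL, and that the PDF link contains `/bitstream/`. The function name and the two marker constants are my own, and fetching the pages is left out so that the logic can be tried on a saved HTML file.

```python
from html.parser import HTMLParser

# Assumed markers, taken from the URL fragments mentioned in the post.
LICENCE_MARKER = "creativecommons.org/licenses/by-sa/"
PDF_MARKER = "/bitstream/"


class RecordPageParser(HTMLParser):
    """Collects link targets and visible text from one record page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def pdf_links_if_licensed(html, licence_marker=LICENCE_MARKER):
    """Return the page's PDF links, but only if the page carries the licence."""
    parser = RecordPageParser()
    parser.feed(html)
    # Look for the licence in both link targets and visible text.
    haystack = " ".join(parser.hrefs + parser.text_parts)
    if licence_marker not in haystack:
        return []
    links = []
    for href in parser.hrefs:
        if PDF_MARKER in href and href not in links:
            links.append(href)
    return links
```

Run over each saved record page, this would give the list of PDF links to hand to a downloader. The licence check is made against the record page and never against the PDF, for the reason given above.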
How many items on OAPEN have a creativecommons.org/licenses/by-sa/ “Rights” label at present, as Martin’s book does? A Google site: search suggests around 650 titles. Half an hour of my filtering the OAPEN CSV suggests it’s actually just over 3,000 under some form of permissive CC that permits commercial use. That’s still a manageable harvest at present. But as the supply of OA books and monographs grows rapidly, the likely result of various OA mandates in the near-future, it might be a useful time-saver for text-miners and digital humanists if OAPEN were to maintain a single torrent of all the PDFs. Inside which a half dozen folders would neatly organise the books by CC licence type. Such a one-click solution might save a lot of faffing around with digging into and filtering their XML and CSV feeds, wrangling with harvester scripts and timeouts, or trying to wrestle with third-party services such as Zotero. A torrent could also save OAPEN’s bandwidth.
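For anyone wanting to repeat the CSV filtering, something like the following Python sketch would do the sorting into licence types. The column names `licence_url` and `pdf_url` are hypothetical, since I am not reproducing the real OAPEN export headers here, and "permits commercial use" is read simply as "a Creative Commons licence without the NC clause".

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real export may label these differently.
LICENCE_COLUMN = "licence_url"
PDF_COLUMN = "pdf_url"


def licence_code(licence_url):
    """Turn a CC licence URL into a short code such as 'by-sa', or None."""
    url = licence_url.strip().lower()
    if "creativecommons.org/" not in url:
        return None
    if "/publicdomain/" in url:
        return "publicdomain"
    _, _, tail = url.partition("/licenses/")
    code = tail.split("/")[0]
    return code or None


def group_commercial_friendly(csv_text):
    """Group PDF links by licence code, skipping NonCommercial licences."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = licence_code(row.get(LICENCE_COLUMN) or "")
        if code is None or "nc" in code.split("-"):
            continue
        groups[code].append(row.get(PDF_COLUMN, ""))
    return dict(groups)
```

Each key in the returned dictionary corresponds to one of the per-licence folders suggested above.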
“On the Persistence of Persistent Identifiers of the Scholarly Web” is a new paper from Los Alamos, finding that many DOIs in a 10,000 random sample are unreachable…
“consistently across request methods, more than half of our DOIs fail to successfully resolve to a target resource”
Despite the misleading “2004” tag on the page identifier, the paper was actually presented in March 2020 at the CNI Spring 2020 Project Briefings.
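Readers who want to try a small-scale version of such a check on their own DOI list could start from a Python sketch like this. It is not the paper's methodology, only an illustration of comparing request methods. It assumes the standard doi.org resolver and counts any 2xx response after redirects as a success. The function names are my own.

```python
import urllib.request

RESOLVER = "https://doi.org/"


def resolves(doi, method="GET", opener=urllib.request.urlopen, timeout=10):
    """Return True if the DOI ends at a 2xx response for this request method."""
    request = urllib.request.Request(RESOLVER + doi, method=method)
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers URLError, HTTPError (4xx/5xx) and socket timeouts.
        return False


def failure_rate_by_method(dois, methods=("HEAD", "GET"),
                           opener=urllib.request.urlopen):
    """Return the share of DOIs that fail to resolve, per request method."""
    rates = {}
    for method in methods:
        failures = sum(not resolves(doi, method, opener) for doi in dois)
        rates[method] = failures / len(dois) if dois else 0.0
    return rates
```

The `opener` argument is there so that the logic can be tested without hitting the network. For a real run, leave it at its default and be polite about request rates.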