Open-i : OA Biomedical Image Search Engine. I tested their collection of x-rays, but found them to be pointlessly small. It appears that the items are all figures auto-extracted from CC medical papers, rather than hi-res scans.
There’s what may be the start of a flurry of long-form press publicity for Sci-Hub: “Meet the Robin Hood of Science” at Big Think, and “The Research Pirates of the Dark Web” at The Atlantic. Did they hire a good publicist, I wonder?
Also from Moscow, a long new interview, in English, with CyberLeninka’s Chief Strategy Officer. It’s very long and I haven’t yet read it all that closely, but there are obviously some interesting statistics and also trenchant comments about Russian attitudes to predatory journals and to OA repositories.
“The Moscow-based CyberLeninka … reports that it currently hosts 940,000 papers from 990 journals, all of which are open access, and approximately 70% of which are available under a CC-BY licence. Significantly, it has achieved this without the support of either the Russian government, or any private venture capital… The service was created, and is maintained, by five people working from home.”
“Since ROAR indicates that CyberLeninka has just 257 records we might want to take these [ROAR] figures [on Russia] with a large dose of salt…”
Indeed. My work on GRAFT strongly suggested that the large repository directories and repo search tools are often out-of-date, and generally in need of a jolly good scrubbing. GRAFT is at least a partial ‘quick-fix’ solution to such problems, I hope, since it’s the result of a week spent combining and thoroughly cleaning.
Not content with ruining Flickr with bloat, painfully slow loading and a clunky new UI… now Yahoo are scaling it down…
“Flickr will be scaled down, and will soon see some cutbacks in near future. … will soon be operated with minimal overhead…”
One day we’ll marvel at a Ken Burns-style documentary feature-film, which will recount exactly how gross mis-management turned Yahoo’s excellent suite of Web services into a puddle of worthless mush in just a few years.
Ho hum, here we go again… “France Wants to Ban Linking to Any Site Without Permission” and “Will Europe’s highest court now kill off hyperlinks?”
A new Ngram-based search tool for repositories, from an Australian student. openaccess.xyz is based on…
“A recent harvest of .edu, .gov, .ac and .org university websites, which I performed, produced around 16,000,000 papers. … I decided to prune a clean set of records (taking only the papers with near perfect metadata – dates, abstracts etc) and then present them in a Bookworm (the software which inspired the Google Books Ngram Viewer).”
As a keyword-based search tool it seems to give very poor results, judging by my test search for nesting bumblebees ecology. But, as an interface design for public search, it’s quite interestingly unusual.
For this early beta it might have been made more useful by filtering the papers to make the focus much tighter. For instance, perhaps just a focus on the flora and fauna of Australasia.
IsisCB Explore is a public bibliographic search tool, from the History of Science Society and the University of Oklahoma…
“Nearly 200,000 interlinked bibliographic citations to books, chapters, articles, dissertations, and reviews from the Isis Bibliography of the History of Science 1974 to present. Annually updated.”
I’m not sure how fresh the DOI Web links are, though. The first three links I tried all proved to be broken.
A new white paper from publisher SAGE, “Expecting the Unexpected: Serendipity, Discovery, and the Scholarly Research Process”.
Serendipity is considered mainly in the context of discovery via automated content-recommendation systems, since the research (a survey and a literature review) was done in the context of the making of the new SAGE Recommends system.
So the report’s not really about serendipity in the wild frontier of academic keyword search on the open Web. There are some interesting observations, however…
“Serendipitous discovery should be of particular interest to information providers precisely because there is so little precedent; there is still tremendous scope for individual organizations to bring their own priorities and values to bear on how they recommend or otherwise help researchers discover their content.”
“If discovery is too exacting or too precise, it can end up reinforcing habits rather than exposing students and researchers to new information, sharply limiting the researcher’s view of the world of information. … We might even suggest that there is room for errors and luck in recommendation systems; a serendipitous system that does not include some element of chance is hardly serendipitous at all.”
“… based on our research, it appears that approaches to encourage serendipity that do not place the content front and centre might encounter problems.” [i.e.: academic searchers want recommendations based on the actual content, rather than on the behaviour or tastes of other system users]
“The less exciting, but equally as important, corollary to discovery is delivery, or access: providing the patron with the material once they have found it. Given that “the researcher’s discovery-to-access workflow is [already] much more difficult than it should be” (Schonfeld, 2015 $ paywall), improving discovery before solving the challenges of infrastructure and access is perhaps kicking the can down the road. This is not to say that there is no value to tools and solutions that promote discovery within an isolated silo, but their potential is limited until publishers, libraries, and discovery vendors make interoperability a priority.”
The UK’s venerable New Scientist magazine is now online and searchable at Google Books, 1956-1989.
OAPEN-UK’s final report on open access monographs, OAPEN-UK final report: A five-year study into open access monograph publishing in the humanities and social sciences.
“Many libraries will […] be providing links to the open access copies of monographs through their discovery systems, but librarians are not always aware of this. A minority are also reluctant to include open access content within their catalogues.”
“30% of respondents currently identify open access monographs for inclusion within their library collections – 49% do not, while 21% were unsure.” — Librarian survey for the report.
Unsure about including OA at all, or unsure if anyone on staff was identifying OA items?
“There are also large numbers of researchers – especially early career and retired academics – who do extremely valuable research which deserves publication but who work outside academic institutions. Changing publishing culture in a way that affected these researchers negatively would damage the overall discipline.”
20,000 hi-res British historical artworks in the public domain, newly online from the Yale Center for British Art. My test downloads freely yielded up hi-res .TIF files, with only a simple numerical “captcha” to fill in before each download.
Size seems to be 3000px at 300dpi, around 18MB to 20MB, which is suitable for a magazine double-page spread. There are often lengthy curatorial commentaries, though these are presumably not in the public domain.
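For anyone wanting to check the print arithmetic on such files, here is a minimal Python sketch. The 3000px and 300dpi figures are from my test downloads. The 420mm spread width (two A4 pages side by side) is purely an assumption for illustration, as magazine trim sizes vary, and the function names are my own.

```python
MM_PER_INCH = 25.4


def print_size_inches(pixels: int, dpi: float) -> float:
    """Physical length, in inches, of an edge of `pixels` printed at `dpi`."""
    return pixels / dpi


def effective_dpi(pixels: int, width_inches: float) -> float:
    """Resolution you actually get when `pixels` are stretched over `width_inches`."""
    return pixels / width_inches


long_edge_px = 3000          # long edge of the test downloads
native_dpi = 300             # resolution embedded in the files

# Hypothetical target: a double-page spread of two A4 pages (2 x 210mm wide).
spread_width_in = 420 / MM_PER_INCH

native_in = print_size_inches(long_edge_px, native_dpi)
print(f"At {native_dpi}dpi the long edge prints at {native_in:.1f} in "
      f"({native_in * MM_PER_INCH / 10:.1f} cm)")
print(f"Across a {spread_width_in:.1f} in spread the same pixels give "
      f"about {effective_dpi(long_edge_px, spread_width_in):.0f} dpi")
```

Swap in the real trim size of the publication to see what resolution the picture would give across its spread.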
Re-finding seems impossible, with the numbers embedded in each file name being unknown to either Google or the Yale Center for British Art website search, so one would do well to save a PDF of each picture’s page along with its hi-res file. Or the simple .txt of the caption, which excludes the curatorial commentary, though the picture’s given title is not always clearly indicated in this.
“The French Lady in London”, c. 1771. French hairdressers residing in London “were often singled out for particular opprobrium”, and their extraordinarily elaborate creations in women’s hairstyles were the subject of caricature in the press.
The new New York Public Library public domain scans website seems to have sorted out its launch difficulties. Visitors can now sort-of enjoy everything from enormous numbers of complete sets of old cigarette cards to old photography of New York City.
I say ‘sort-of’ because I found that every test item I tried was capped at 720px, and the hi-res versions are only obtainable on payment. The site’s image URL path is also hidden from Google, so one can’t use Google Images to find just the free hi-res versions, if there are any.
Overall, despite its very large scope and quality selection, the huge paywall means that it’s hardly the most exemplary presentation of public domain material.
Byron Russell, manager of Ingentaconnect, wants to search only for freely re-usable Open Access articles, but finds that ‘the Google moment’ for such a search hasn’t arrived yet…
Run a Google search on “Mendelian dominance open access” and the first two hits are for one publisher – the OMICS Group.
Judging from the Google Search results I got when trying to recreate his search, what he actually searched for was: Mendelian dominance open access (without the quote marks). Difficult to see how such a loose search would find something worth having. But even if he’d then gone on to say… ‘so, we need to teach students how to search Google properly…’, his article’s point would have been much the same. Even using sophisticated Google search methods, one still gets mired amid a swamp of Powerpoints, K-12 lesson plans, student quizzes, wikis, high-ranking predatory journal articles and other junk.
JURN does a fairly good job with…
Mendel “dominance” “Commons Attribution” -noncommercial
Having Mendel without quote marks in that way catches Mendel | Mendel’s | Mendelian, since Google automatically expands the name.
The target CC content, as currently found on OA journals via JURN, seems to reside almost entirely in PLOS, Pubmed, Springer and a few others.
But there’s more in the hybrid journals. So one can also approximate a main Google Search across the large publishers, Elsevier for instance, via something like…
site:www.sciencedirect.com/science/article/ “Commons Attribution” -noncommercial -“non-commercial”
For Oxford Journals it’s slightly different…
inurl:oxfordjournals.org “Commons Attribution” -“non-commercial”
(Google will probably flash up an annoying “captcha” to make sure you’re not a robot, if you’ve worked through the examples down to this point).
And so on… one could just work through the larger publishers that way. For Springer most of the work has already been done by Paperity, although Paperity still lacks coverage of a couple of OA Springer titles.
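Working through publishers that way is repetitive, so here is a minimal Python sketch that assembles the searches. The scope operators and exclusion terms are copied from the two examples above. Adding a topic such as "Mendel dominance" to a publisher-scoped search is my own assumption, as are the function and variable names. The URL is only meant for pasting into a browser by hand, not for automated querying.

```python
from urllib.parse import quote_plus

# Scope operator and exclusion terms per publisher, taken from the
# Elsevier and Oxford Journals example searches. Add further
# publishers only once you have checked what works for their sites.
PUBLISHERS = {
    "Elsevier": (
        "site:www.sciencedirect.com/science/article/",
        ["-noncommercial", '-"non-commercial"'],
    ),
    "Oxford Journals": (
        "inurl:oxfordjournals.org",
        ['-"non-commercial"'],
    ),
}


def build_query(scope, exclusions, topic=""):
    """Join scope, optional topic, the licence phrase and the exclusions."""
    parts = [scope, topic, '"Commons Attribution"', *exclusions]
    return " ".join(part for part in parts if part)


def search_url(query):
    """URL-encode a query so it can be pasted into a browser address bar."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for name, (scope, exclusions) in PUBLISHERS.items():
    # The topic here is a hypothetical addition, not part of the original searches.
    query = build_query(scope, exclusions, topic="Mendel dominance")
    print(f"{name}: {query}")
    print(f"  {search_url(query)}")
```

Leaving the topic empty reproduces the two publisher searches as given, with plain rather than curly quote marks.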
It’s certainly not ideal, as Russell suggests. On the other hand, one might ask why someone needs to find just the CC-BY content on a topic. Perhaps it’s actually quite useful that a big publisher would find it difficult to automatically siphon all known CC-BY articles and books into its own giant repository, slap on some search, mining, overlay journal and themed book-compiling tools, and then sell access to it.
NESTA’s Electricomics research project now has its final report available. Lots of market segmentation and reader demographics, at the back…
The Oberlin Group — a large consortium of the top-ranked liberal arts colleges across the USA — is pledging $1 million to Lever Press to “publish 60 new [platinum] open-access titles by the end of 2020”, presumably journals.
The New York Public Library has released scans of 180,000 public domain items in their collections. “No permission required [for use]. No restrictions on use… go forth and re-use!” Nice, but actual access is currently rather clunky. Seemingly only via a gee-whizzy visualization tool that never stops its “Loading…”, so I couldn’t actually test what they have on offer.
Update: low-res only, access to hi-res versions is paid.
Boeing has sponsored a new free archive of 4,500 issues of Aviation Week & Space Technology.
Beall’s List of Predatory Publishers 2016. An update on the now-rolling main list, and links to newer lists for hijacked journals and for dubious purveyors of metrics.
DuckDuckGo’s excellent Image Search has added a size filter…