MIT’s Mind the Gap is a new comprehensive survey report on open source publishing systems that can be used for scholarly purposes. The only one I can see that’s missing is WordPress, which is open source, free, easy to use and rent a server for, and can be quickly tooled-up with plugins for such purposes. In fact, it’s not mentioned even once, not even to explain why it and its plugins were omitted.
Open Access Week: Events listing for 22nd – 28th October 2018.
My thanks to Jan Szczepanski for alerting me that his list has a new home. Szczepanski’s List of Open Access Journals is now available via the EBSCO.com website, as a freely downloadable .DOCX Word file. The page describes this 35,000-title / 9Mb / 672-page subject-ordered list as… “probably the world’s largest list of open access journals in the humanities and social sciences” plus Law, Statistics and Geology. No “probably” about it, this fine work is surely the largest OA-only list which is also freely available to the public. The EZB is at a similar scale, and does have 37,600 OA titles since 2010, but the EZB covers all disciplines including the sciences.
I had assumed the list had long since stopped updating circa 2015 and was now a legacy item, but the new header now notes “2,932 titles added 2017”. The file’s internal datestamp says the last update in Word was made March 2018. Unfortunately there’s no “First included in (date)…” component to the list entries, by which one might extract just the new 2016-18 finds. Nor does EBSCO appear to offer a rolling “What’s New”.
Having found the list again I’ve now added a link to it on the JURN Guide / Directory / FAQ pages. Despite its inevitable 15% linkrot (see below for details on that) Jan’s list has a big advantage over the JURN Directory, in that it can serve as a discovery tool for OA titles published in languages other than English, and which are unlikely to be on the DOAJ for various legitimate reasons. In contrast, the JURN Directory only lists titles which publish in English or are genuinely multi-language with English.
The home-page URLs given in Jan’s list appear to be hyperlinked with blue-underlining, but my copy of Word won’t allow me to launch these. That’s probably just a security thing on my PC. A simple “Save as…” to a .PDF does give me launchable URLs.
One can also save the list from Word to a filtered .HTML page, which gives a 23Mb Web page with launchable URLs. Despite being so large, my Linkbot software didn’t have problems with my saved Web page version of the list. 60 minutes later the Linkbot results reported that the list has the disadvantage of link-rot:
* Contains 45,536 individual home-page URLs across approx. 35,000 titles (some records have more than one URL). Of broken…
* After discounting the useful redirects (e.g.: http:// to https:// or the ubiquitous OJS redirects from the home-page to the current issue page), an Excel tally told me there were 6,808 ‘fatal’ URLs such as outright 404s, ‘Unauthorized’ or ‘Host not found’. That’s a 15% rot rate (6,808 of 45,536).
Perhaps someone could now look at helping Jan with a crowd-funder, to pay some Web-elves (on Fiverr.com or similar) or unemployed recent graduates to fix the broken 15% of URLs? At $20 per batch of 25 URLs, I figure it should cost around $5,500 for the 6,808 dead links.
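For anyone wanting to check the sums, here is a minimal sketch of the tally in Python. The figures are the ones reported above; rounding up to whole batches is my own assumption about how such work would be billed, and the result lands in the same ballpark as the estimate above.

```python
import math

total_urls = 45_536      # home-page URLs found in the list
fatal_urls = 6_808       # outright 404s, 'Unauthorized', 'Host not found'
batch_size = 25          # URLs per paid batch
price_per_batch = 20     # US dollars per batch

rot_rate = fatal_urls / total_urls
# Assumption: a part-filled final batch is billed as a whole batch.
batches = math.ceil(fatal_urls / batch_size)
cost = batches * price_per_batch

print(f"rot rate: {rot_rate:.1%}")               # rot rate: 15.0%
print(f"batches: {batches}, cost: ${cost:,}")    # batches: 273, cost: $5,460
```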
itty.bitty, new from the design leader at Dropbox. Itty.bitty uses the URL to contain the text of a Web page. The page can have 2,000 bytes, or about 170-200 words, if you’re going to support legacy Web browsers such as Internet Explorer.
No hosting server is required, as the data sits after the # symbol in the URL. What comes after the # is meant to be page-position related, and as such it never gets sent to the server.
The base64 link code is not pretty…
But the same link/page displays as…
This link is the page.
Scripting and hyper-linking is enabled in such pages, so long as it all fits in the URL length. The code can’t do images, but you can do old-school ASCII-art.
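The general idea of carrying a page in the part of the URL after the # can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it uses plain URL-safe base64, a placeholder `example.org` address and made-up function names, and it is not claimed to be itty.bitty's actual encoding.

```python
import base64
from urllib.parse import urlsplit

LEGACY_LIMIT = 2000  # byte budget mentioned above for legacy browsers

def make_fragment_link(text: str, base: str = "https://example.org/") -> str:
    """Pack the text into the URL fragment (the part after '#')."""
    payload = base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")
    return f"{base}#{payload}"

def read_fragment_link(url: str) -> str:
    """Recover the text from the fragment, as a browser-side script might."""
    fragment = urlsplit(url).fragment
    return base64.urlsafe_b64decode(fragment).decode("utf-8")

link = make_fragment_link("This link is the page.")
parts = urlsplit(link)

print(link)
print("path a server would be asked for:", parts.path)
print("decoded page text:", read_fragment_link(link))
print("fits the legacy limit:", len(link.encode("ascii")) <= LEGACY_LIMIT)
```

Note that the path is just `/`: everything meaningful lives in the fragment, which the browser keeps to itself.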
The main drawback seems to be that you’re going to have to be 1000% sure that your text is exactly as you want it before you make the link… because there’s no after-post editing for errors or updating of dead hyperlinks in the page.
In which case you’d ideally consistently version and date-stamp the
How it works
How it works (v.0.1 | 14/07/2018)
…so that people and search-tools can discover later updated versions of the same content. Otherwise the itty.bitty system risks becoming an intertwingled mess of half-baked and old/broken stuff that you (and probably Google) won’t want in search results.
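A consistent stamp of that kind is easy to generate and to read back by machine. The following is a rough Python sketch, assuming the "(v.X | DD/MM/YYYY)" pattern shown above; the function names and the regular expression are my own and not part of itty.bitty.

```python
import re
from datetime import date

# Matches titles such as "How it works (v.0.1 | 14/07/2018)".
STAMP = re.compile(
    r"^(?P<title>.*) \(v\.(?P<version>[\d.]+) \| (?P<day>\d{2}/\d{2}/\d{4})\)$"
)

def stamp_title(title: str, version: str, day: date) -> str:
    """Append a version and a day/month/year date-stamp to a title."""
    return f"{title} (v.{version} | {day.strftime('%d/%m/%Y')})"

def parse_title(stamped: str):
    """Return (title, version, day) or None if the title carries no stamp."""
    match = STAMP.match(stamped)
    if match is None:
        return None
    return match["title"], match["version"], match["day"]

stamped = stamp_title("How it works", "0.1", date(2018, 7, 14))
print(stamped)                      # How it works (v.0.1 | 14/07/2018)
print(parse_title(stamped))         # ('How it works', '0.1', '14/07/2018')
print(parse_title("How it works"))  # None
```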
I’m guessing that advanced Web browsers such as Brave will soon ‘add a feature’ in relation to this, by enabling much longer data-carrier code to be read from URLs. Perhaps also some simple automatic “…and can we find a later version of this itty.bitty.site?” query, done inside the browser. There would, however, also have to be some sort of dynamic ownership hash embedded in the page, to protect against impersonation of the page-author. Perhaps the system of authoring an ownership-hash and datestamp could be combined into a simple ‘one-click operation’ in a desktop authoring tool.
Anyway, it’s one example of the coming uncensorable Decentralized Web.
“the most common mechanism for OA is not Gold, Green, or Hybrid OA, but rather an under-discussed category we dub Bronze: articles made free-to-read on the publisher website, without an explicit Open license.”
Of Bronze, “few studies have highlighted its role” [in OA]. “We manually inspected a small sample of Bronze articles in order to understand this subcategory more; we found that while many Bronze articles were Delayed OA from toll-access publishers, nearly half were hosted on journals that published 100% of content as free-to-read but were not listed on the DOAJ and did not formally license content (using CC-BY or any other license).”
Bronze was found to be at a whopping 47% of OA, from a one-week sample of Unpaywall-DOIs in 2017.
“Open for Business: Open Access Journals in Commercial Databases”, a new article (dated 27th October 2017) in The Serials Librarian journal.
‘Table 3. Presence of OAs titles in databases’. Percent of DOAJ titles found in database:
Scopus – 29.18%
Academic Search Complete – 18.00%
Web of Science – 10.91%
… between 70.82% and 89.09% of DOAJ journals are not found in the databases analyzed here, which is potentially problematic given that most researchers depend on databases to locate scholarly information
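The "not found" range quoted there is simply 100% minus the coverage figures in Table 3, as this small Python check shows. The percentages are those quoted above; `Decimal` is used only to keep the two-decimal figures exact.

```python
from decimal import Decimal

# Percent of DOAJ titles found in each database (Table 3, as quoted above).
found = {
    "Scopus": Decimal("29.18"),
    "Academic Search Complete": Decimal("18.00"),
    "Web of Science": Decimal("10.91"),
}

# The share of DOAJ titles missing from each database.
not_found = {name: Decimal("100") - pct for name, pct in found.items()}

for name, pct in not_found.items():
    print(f"{name}: {pct}% of DOAJ titles not found")

print("range:", min(not_found.values()), "to", max(not_found.values()))
```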
There doesn’t seem to be a portable PDF version, just a tablet-tastic Web site with no RSS feed. I don’t mind the lack of a PDF, but if they want to be in the newsfeeds of influencers then surely someone needs to plug in the RSS module.
A possible unwanted side-effect of making PhD theses open access in public repositories, if not actually Creative Commons… image libraries want hefty image reproduction fees…
“consider that your average art history PhD will have dozens, if not perhaps hundreds, of images, then soon even an unpublished PhD can become prohibitively expensive. You want to discuss mid-18th Century portraiture, and show perhaps 50 images? That’ll be £750. You want to turn that PhD into a book? £3,050 please, before you’ve even thought of printing costs. Want to put on a Hogarth exhibition, with a decent catalogue? £8,600. Ouch. And Tate [in the UK] are on the cheaper end of the scale.”
And that’s before many image libraries realise that the PhD might be made public as a PDF, and thus that their digital pictures could be extracted at print-res (pro version of Adobe Acrobat, go: Tools | Document Processing | Export All Images) and then whisked into the public domain by cackling anarchists.
But the image given in the article as an example seems to have already had something similar happen to it. It’s the Tate’s copy of “The Painter and his Pug” (£162, please… the Tate having already taken PhD PDFs in repositories into account, and gouged accordingly). The picture’s now on Wikimedia and gleefully marked as public domain.
Still, that picture is by Hogarth. If you’re writing on someone more obscure or more modern, or don’t have the time or search skills to go burrowing into Hathi and Archive.org, then I can see how the gouging ‘repository-increased’ fees could make it difficult.
And difficult not only for the hapless writer, but also for librarians. Once the PhD is in a repository and is the institution’s responsibility, one suspects that some especially vicious picture libraries may even decide to make a bundle of cash by finding ‘personal use’ images in PhDs and demanding institutional prices for their use. In which case, might we in future see PhD PDFs with most of the pictures blanked out, due to a mis-match between the assumed ‘personal use, on the library-shelf only’ licence for the pictures (for instance, Google’s 10m-picture LIFE magazine archive) and the subsequent public and institutional status of the document once it hits the repository? (And it increasingly does so with or without the permission of the author). If so, who is then going to go through and censor? One suspects it’ll be too much trouble for librarians to do that by hand, and too much trouble to figure out what stays and what goes (I assume 100% reliable machine-readable rights tagging is a non-starter, due to the human author in the loop). In which case the university’s risk-averse lawyers would just recommend that some bot should automatically detect and delete all the pictures. Or, as with the Digital Library of India books that I’ve seen recently, their contrast would be increased so far that the pictures become almost illegible.
One way an author might get around that is to also provide a search link with keywords and phrases embedded in the URL. Thus my URL, when clicked, searches multiple image search-engines for “The Painter and his Pug” etc with a size of more than 2MB. Of course, readers can do that for themselves, but it would be a nice future-proofing courtesy. Or what about ‘intelligent PDFs’ that do that for you, fetching and embedding the required image on-the-fly from wherever it can be best found? An AI might help with that, and perhaps the link might contain an AI-friendly formula for what the required image should look like (big red splotch here, eyes there, etc) to ensure that the correct one is fetched.
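Building such a search link is straightforward. Here is a hedged Python sketch: the `example.org` endpoint and the `q` and `min_size` parameter names are placeholders of my own, since each real image search-engine has its own URL format that would need to be looked up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_image_search_url(title: str, artist: str, min_bytes: int,
                           base: str = "https://example.org/image-search") -> str:
    """Embed an exact-phrase title, an artist keyword and a minimum file
    size in the query string of a (hypothetical) image search URL."""
    params = {
        "q": f'"{title}" {artist}',   # quoted phrase plus a keyword
        "min_size": min_bytes,        # hypothetical size filter, in bytes
    }
    return f"{base}?{urlencode(params)}"

url = build_image_search_url("The Painter and his Pug", "Hogarth",
                             2 * 1024 * 1024)
print(url)

# Reading the link back shows the keywords survive the encoding intact.
print(parse_qs(urlsplit(url).query))
```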
Publishers have until 10th February 2017 to submit suggested humanities book titles to Knowledge Unlatched. Selected books are made Open Access in perpetuity, albeit usually minus the cover art/design as part of the Creative Commons PDF. Losses are defrayed by a consortium of libraries.
106 Knowledge Unlatched titles currently show up in OAPEN and thus in JURN. However, 343 titles were unlatched for 2016, which means that a lot more are coming soon.