On linkrot

A new study of linkrot in Digital Humanities Quarterly, “Reference Rot in the Digital Humanities Literature”.

“[in the DHQ sample] over a quarter of sampled citations are links to websites. Over 30% of these references are [now] inaccessible or have additional access barriers.”

Perhaps we need a copyright-busting AI for this? Imagine that, with ‘one press of a button’, a ref-bot AI goes and visits/reads the reference links at the time of the article’s publication, and thus produces a unified set of summaries. Perhaps with each summary weighted towards topics being discussed in the paragraph before the point-of-citation. The result would then be offered alongside the published article, as an appendix. Since AI-made text cannot be in copyright, the publishers’ lawyers would presumably not swoon at such an idea. Of course, the author would then have to fact-check and human-approve it as correct. But that should not be too onerous.

Hide all Amazon search results containing a keyword

How to hide all Amazon search results containing the word Bluetooth.

Why use this:

i) Let’s say you are searching for wireless headphones. You want headphones with a proper radio-frequency wireless base-station that has a rock-solid 100-yard range, and not those that use the infernal and unreliable Bluetooth system. You thus want to remove the vast number of Bluetooth headphones from your search results. But Amazon’s filtering system won’t allow you to do that.

ii) Or perhaps you simply want to remove all results with a title containing your own chosen keyword. Again, this assumes that Amazon lacks the required sidebar filtering, and that you have hundreds of results to manually trawl through. In which case, just change the keyword used below.

Required:

Use this simple code with the popular free Web browser add-on “uBlock Origin”, by adding it to uBlock’s filter list. Simply paste the code to the list and save.

! Hide all search results on Amazon which contain bluetooth in the title
amazon.co.uk##[data-component-type="s-search-result"]:has-text(/bluetooth/i)

Of course you should also change amazon.co.uk to whatever your usual national Amazon store is, if you’re not in the UK.
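If you want to swap the store or the keyword often, the filter line can be generated rather than hand-edited. What follows is only a rough sketch in Python: the helper name build_filter is hypothetical and is not part of uBlock Origin, and it assumes the keyword is a single plain word (letters and digits only) so that nothing needs escaping inside the /.../i pattern.

```python
def build_filter(store_domain: str, keyword: str) -> str:
    """Return a uBlock Origin cosmetic filter line that hides Amazon
    search results whose text contains `keyword` (case-insensitive).

    Hypothetical helper for illustration; it only formats the same
    filter line that is shown in the post above.
    """
    # Assumption: keep to simple alphanumeric keywords, so that no
    # regex or filter-syntax escaping is required.
    if not keyword.isalnum():
        raise ValueError("keyword must contain only letters and digits")
    return (
        f'{store_domain}##[data-component-type="s-search-result"]'
        f":has-text(/{keyword}/i)"
    )


if __name__ == "__main__":
    # Prints a line ready to paste into uBlock's filter list.
    print(build_filter("amazon.co.uk", "bluetooth"))
```

Paste the printed line into uBlock exactly as you would the hand-written one.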

You should not find it also interfering with your Wishlist pages, but if you do then whitelist in uBlock’s ‘Trusted Sites’ thus…

www.amazon.co.uk/hz/wishlist/ls/*

Thanks to RraaLL for suggesting an improvement to my initial way of doing it. Post updated.

New book: Athena Unbound

A new free book from a UCLA historian, Athena Unbound: Why and How Scholarly Knowledge Should Be Free for All. Partly a history (no mention of JURN, though), and partly another stab at ‘how to make OA work’ in the future. There’s also a podcast interview with the author, albeit revealing some rather interesting assumptions. Such as…

“ChatGPT as I understand it at the moment scrapes and feeds off of the crappy end of the Web … I don’t think it’s able to get past the paywalls and into the scholarly databases and into the journals, as far as I know. So insofar as that’s true, then all we’re getting is a garbage-in, garbage-out product from ChatGPT … good ChatGPT should be based on the stuff that right now the paywalls keep us out of.”

The idea that worthy content is only to be found behind a paywall will raise an eyebrow among many OA publishers and indexers. He also makes the even more questionable assumption that piracy no longer exists in non-academic content (movies, games, TV, software etc). But those assumptions aside, his core points are thought-provoking…

i) It certainly would be interesting if an AI could be trained purely on a critical mass of non-science / non-medical academic journal texts. On say… Sci-Hub’s PDFs, Semantic Scholar’s PDFs (which I’m assuming subsumes the DOAJ’s relatively small PDF holdings), and perhaps even all the PDFs that could theoretically be harvested after spidering JURN’s index URLs. So far as I’m aware, in the admittedly blisteringly fast development of AIs, there’s nothing like that just yet. None of those three gives complete coverage, of course. But even in a partial early form such an AI would be interesting to have.

ii) He also raises the question of copyright in the output of such journal-ingesting AIs. If the pure unaltered text product of an AI cannot be copyrighted, he suggests that many will come to prefer the AI’s potted answers over struggling with the actual (paid) articles from which it was hashed. I’d add that what they won’t prefer to do, most likely, is then to laboriously hand-check the AI’s factual claims, logic, references, etc that may trip them up in a follow-on use of the text. Also the errors of taste and historical knowledge that will likely occur with scholarly arts/humanities AIs, such as we already see in dumb taste-matching software on store sites, for instance assuming that Ziggy-era Bowie is the same as Eno-era Bowie and Tin Machine-era Bowie, or that if you like The Hobbit you will also enjoy The Silmarillion.

That said, Elon Musk and others are already reported to be working on fact-checking and check-able ‘citation finding’ AIs. Daisy-chained workflows between very different AIs will likely emerge, and doubtless there will even be AIs which can suggest and optimise such daisy-chains. Part of such chains will likely be AI modules which try to strip out “AI-ness” and also steganographic watermarking and suchlike, and attempt to add “human-ness” to the look and feel of the sale-able end product. Perhaps even filters for glaring “errors of taste” in matters relating to art and literature.

Release: PDF Index Generator 3.3 (May 2023)

A new version of PDF Index Generator, the first in a year. It’s the best standalone desktop software for making back-of-the-book indexes from finished PDFs. New in version 3.3 (May 2023), among other changes…

* Can now run-in the sub-headers (rather than doing list-style sub-headings). Video.

* Multi-page indexing (e.g. 265-278) can now be truncated (as 265-78). Video.

* Even/odd pages can now have their own margins set. Video.

* “Added an Include query … to index capitalized words”.

* Database files are now much smaller.

* “Fixed footnotes as it was showing footnote number & normal page number too!”

That last one is especially important for footnoting scholars. The footnotes feature was introduced in 2.9 (February 2020).

The software is still working all the way back to Windows XP, and is still the same price.
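For anyone unsure what the truncated page-range style in the list above looks like in practice, here is a minimal Python sketch. It is not PDF Index Generator’s own method, which the release notes don’t describe. The function name truncate_range and the rule used (drop the leading digits the two page numbers share, but always keep at least two digits) are my own assumptions for illustration.

```python
def truncate_range(first: int, last: int, keep: int = 2) -> str:
    """Shorten a page range such as 265-278 to 265-78.

    Assumed rule (illustrative only): drop the leading digits that
    both page numbers share, but never show fewer than `keep` digits
    of the closing page number.
    """
    a, b = str(first), str(last)
    # Leave the range alone if the numbers differ in length
    # (e.g. 98-102) or do not form an ascending range.
    if len(a) != len(b) or first >= last:
        return f"{a}-{b}"
    shared = 0
    while shared < len(a) and a[shared] == b[shared]:
        shared += 1
    # Cap the number of dropped digits so that `keep` digits remain.
    shared = max(0, min(shared, len(b) - keep))
    return f"{a}-{b[shared:]}"


if __name__ == "__main__":
    print(truncate_range(265, 278))  # the example from the release notes
```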

Added to JURN

Motifs (partly in English)

Zena-Lisssan : Journal of Academy of Ethiopian Languages and Cultures

Journal of Ethiopian Studies

Ethiopian Journal of the Social Sciences and Humanities

Ethiopian Journal of Languages and Literature

Africa Design Review Journal

Young Scientific Music and Dance Forum

Digital Age in Semiotics & Communication

Sledva : NBU Journal of Humanities and Arts

Yearbook of the Department of Foreign Languages and Cultures

Nairobi Journal of Literature, The

Zambia Journal of Religion and Contemporary Issues

African Journal of Entrepreneurship and Innovations

KIU Journal of Humanities (Uganda, Africa)

Africa Habitat Review (African planning and the built environment)


Annual of Natural Sciences Department (New Bulgarian University)

Phytopathology (plant diseases)

Bard AI now open, without a waitlist

Google’s Bard AI is now available, with no waitlist. This is what it told me about its capabilities, when prompted…

Initial testers of Bard were not impressed when it first appeared. But this is Google, and they’ll improve it. For instance, just over a week ago they plugged PaLM 2 into Bard, which is said to add Python and JavaScript code-writing and debugging abilities, trained from a gazillion lines of working code on GitHub. Along with that comes a better understanding of maths and logic.

Worth another look (if you’re outside the EU, since it’s blocked there), and it’s pleasingly Google-fast.

It can also parse dense online PDFs into a Q&A format. For instance…

Browse URL_HERE.pdf read it fully and give me 10 questions and answers about this pdf’s content.

EU proposes drastic changes to paper book imports

Currently “small-value goods can be imported free of extra charges” into the EU. ‘Small’ has meant that parcels valued under 150 Euros (about $160) don’t currently attract customs fees. German newspapers and others are now reporting this is set to be abolished… if an amendment to planned EU customs reforms, tabled by a French MEP, passes into law.

Hopefully this is just a bit of French anti-Amazon gesture-politics, and the amendment will be withdrawn or struck out. Before it can cause damage to the cross-border mail-order trade in books, journals, comics and BDs, small artworks and Etsy-like crafts, and suchlike.

But other moves are equally ominous, and suggest a wider aim among the EU’s MEPs. One reads of plans that would see all mail-order sellers forced to “charge customs duties and VAT [national sales tax] at the time of purchase”, and they would also have to register with a giant new EU Customs Authority and log all transactions and buyers. This seems likely to place a huge and disproportionate burden on small publishers and catalogue-based dealers, such as those selling paper books into the EU. Small-scale creatives are facing enough challenges (digital tax reporting, increased postage, rampant piracy, generative AIs, customers no longer buying due to the cost-of-living, etc). They don’t need to be whacked with this as well.

Developing…

Abandoning the minus sign in search

A quick test to see which search-engines still respect the – minus sign, commonly used to exclude search results containing an unwanted word.

Test query: Tolkien 2023 -calendar

Tested: Bing, DuckDuckGo, Google Search, Google Scholar, Carrot2, eTools, Amazon.uk, YouTube, Yandex, Listen Notes (podcast search), Amazon, Internet Archive (“text contents”), Yahoo Search.

Results: The following still respect the minus sign:

* Google Search.

* Carrot2.

* eTools.

* Google Scholar (though there’s strange behaviour with this search – only two rather random results and a link to “see all”).

* YouTube, but only if you do it as…

Tolkien 2023 NOT -calendar

All the others failed, except for Yandex, which couldn’t be tested because constant complex captchas make it unusable these days. Amazon.uk doesn’t respect the – sign either, and the lack of filters makes it impossible (for instance) to remove all large-size headphones that use the infernal Bluetooth for wireless connection.
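Where a search-engine ignores the minus sign, the exclusion can still be applied afterwards to whatever list of result titles you manage to copy or export. A minimal Python sketch of that idea follows. The function names and the two sample titles are made up for illustration, and it does a plain case-insensitive substring match rather than anything a real search-engine does.

```python
def split_query(query: str) -> tuple[list[str], list[str]]:
    """Split a query into wanted terms and minus-prefixed excluded terms."""
    wanted, excluded = [], []
    for token in query.split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            wanted.append(token)
    return wanted, excluded


def filter_titles(titles: list[str], query: str) -> list[str]:
    """Drop any title containing one of the query's excluded words."""
    _, excluded = split_query(query)
    return [
        title
        for title in titles
        if not any(word in title.lower() for word in excluded)
    ]


if __name__ == "__main__":
    # Hypothetical result titles, for illustration only.
    titles = [
        "Tolkien Calendar 2023",
        "Tolkien 2023 conference programme",
    ]
    print(filter_titles(titles, "Tolkien 2023 -calendar"))
```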

Multi-column fix for DuckDuckGo

I made a super-simple fix for the current lack of DuckDuckGo multi-columns on a desktop PC. Multi-columns broke a few days ago.

Works with the Stylus UserStyle addon for your Web browser. Tested in Opera on Windows…

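@-moz-document url-prefix("https://duckduckgo.com/") {
  /* Lay the main results area out in three columns */
  .react-results--main {
    column-count: 3;
    width: 1500px;
    background-color: #eff2f7;
  }
  /* Individual result blocks. The class name looks auto-generated,
     so it may change and need updating. */
  .wLL07_0Xnd1QZpzpfR4W {
    display: inline-block;
    width: 450px;
  }
}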

It’ll do as a temporary fix, until the vastly more complex DuckDuckGo – Multi-Columns – UserCSS (userstyle) v.40 is fixed.

Some forthcoming open journals

A quick trawl for forthcoming open journals, to be published in English, yields…

Historical Thinking, Culture and Education (teaching of history) (forthcoming)

Journal of Music Archaeology (forthcoming)

Hunara : Journal of Ancient Iranian Arts and History (forthcoming)

Philosophy of Physics (forthcoming)

Journal of Cycling and Micromobility Research (forthcoming)

Moduli (journal of the London Mathematical Society) (forthcoming)

Ocean and Society (forthcoming)


And I picked up another few that JURN was already indexing. These will be added to the Directory as links, in due course…

Syllogos : Herodotus Journal

East Asian Journal of Classical Studies, The

Transfer : Journal for Provenance Research and the History of Collection (partly in English)