How to replace Instapaper or Pocket on a Windows desktop

How to replace Instapaper or Pocket on a Windows desktop, with no Cloud service required:

1. Install the Web browser addon Save as eBook from the Chrome store. This adds a neutral little button icon next to your Downloads icon.

2. When you visit an article in a magazine or newspaper, place your cursor at the top or bottom of the article and then drag to manually select the headline and text. Then click the new addon’s button and choose “Add selection as eBook chapter”. The selected article is saved locally in the addon. It can handle scholarly-sized articles, and didn’t balk at a 14,000 word essay.

3. When you have enough articles saved in that way (10, 15, 20, a week’s worth, whatever suits you), open the Chapter Editor via the addon. You can reorder the saved articles as you see fit, and then you press “Generate eBook”. Only .ePub output is supported, and each article is saved as a chapter. Having chapters is vital on a device like the Kindle 3 ereader, as it’s then easy to skip back and forth between articles rather than laboriously paging through.

You can also title the eBook with a date.

4. Those using the older dedicated Kindle e-ink ereaders (rather than newer all-purpose Kindle tablets), and who want a Kindle .MOBI with chapters intact, can convert using the Calibre software. Chapter sections should be retained in the conversion. (If not, look at Settings in Calibre). Assuming you want a weekly batch of articles in a single file, then the Calibre .ePub > .Mobi conversion shouldn’t be too tedious to do.

5. Then either do… i) a manual USB transfer, ii) a local Wi-Fi transfer, or iii) an online “Send to Kindle…” operation from Windows.

6. When you’re ready to start saving another batch of articles, first delete the old ones from the addon.
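The Calibre conversion in step 4 can also be scripted, which may help if the weekly batch becomes routine. What follows is only a minimal sketch, assuming Calibre is installed and that its `ebook-convert` command-line tool is on your PATH. The function names and the "Articles" title prefix are my own illustrative choices, not anything the addon or Calibre provides.

```python
import subprocess
from datetime import date
from pathlib import Path


def build_convert_command(epub_path: str) -> list[str]:
    """Build the Calibre command that converts an .epub to a .mobi beside it."""
    source = Path(epub_path)
    target = source.with_suffix(".mobi")
    # ebook-convert takes the input file first, then the output file.
    return ["ebook-convert", str(source), str(target)]


def dated_title(day: date) -> str:
    """Hypothetical helper: a date-based title for the weekly eBook."""
    return f"Articles {day.isoformat()}"


def convert(epub_path: str) -> None:
    """Run the conversion; raises CalledProcessError if Calibre reports failure."""
    subprocess.run(build_convert_command(epub_path), check=True)
```

Calling `convert("weekly.epub")` should leave a `weekly.mobi` in the same folder, ready for the transfer in step 5. Whether chapters survive still depends on Calibre's own conversion settings.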

There are a few drawbacks…

* There’s no automatic ‘readability’ detection of the body text, with stripping of page-fluff, ads, pictures etc. To do that you have to do a single sweep down the page to manually select the headline, byline and body of the article.

* Even then, it’s not possible to omit images from your selection, for instance by turning off image loading in your browser. The addon is evidently copying the underlying HTML code of the selection, not simply the visible text, and it calls the images from that HTML when it makes the eBook. That’s also why it doesn’t work alongside ‘readability’ or ‘read view’ addons. The addon maker might usefully add a “Never save images” toggle in future, and enable multi-part selections so as to skip mid-article text ads, or to select just ‘headline/byline + several parts of the article’.

* You don’t make a ‘record’ of all the articles you’ve saved over time, other than in the form of the eBook output itself. There’s no .CSV list to download, with headline title, URL and suchlike, as there was with Instapaper. But if you archive the .ePub files, that serves as a full-text searchable archive.

* Nor is there any means of pushing your saved and linked article list to RSS for public consumption. But there are better link saving-and-sharing tools available for doing that.
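On the point about archived .ePub files serving as a full-text archive: an .ePub is a ZIP container holding (X)HTML chapter files, so a folder of them can be searched with a short script. This is a rough sketch only. The function name is hypothetical, the tag stripping is deliberately crude, and it assumes the chapters are UTF-8 encoded.

```python
import re
import zipfile
from pathlib import Path


def search_epubs(folder: str, phrase: str) -> list[tuple[str, str]]:
    """Return (epub file name, chapter file name) pairs whose text contains phrase."""
    needle = phrase.lower()
    hits = []
    for epub in sorted(Path(folder).glob("*.epub")):
        with zipfile.ZipFile(epub) as book:
            for name in book.namelist():
                # Only look at the chapter markup, not images or stylesheets.
                if not name.lower().endswith((".xhtml", ".html", ".htm")):
                    continue
                markup = book.read(name).decode("utf-8", errors="ignore")
                text = re.sub(r"<[^>]+>", " ", markup)  # crude tag stripping
                if needle in text.lower():
                    hits.append((epub.name, name))
    return hits
```

For example, `search_epubs("C:/ebooks/archive", "open access")` would list each archived eBook and chapter file that mentions the phrase (the folder path here is just a placeholder).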


How to block eBay sellers from appearing in search-results

Some picture researchers will be tracking certain place-based keywords in eBay, looking for new public-domain pictures of historical locations, old folk-costume and similar. You don’t necessarily want to limit your search to just “postcards”, as that will omit much. So you do a search with a wider range. But then how do you filter unwanted users who regularly clutter your location-based search results with unwanted items such as key-fobs, fridge-magnets, stock or reseller aerial-photography images, and heavily watermarked images?

There’s a handy Web browser script for that problem. EBay: Custom Page Controls And Seller Block List is a UserScript for blocking users who regularly clutter your search results with unwanted items. Like all UserScripts, the script can only work inside a UserScript handler add-on, such as TamperMonkey.

I should point out that EBay does provide the ability to block natively, but only per search, and not as a perma-block.

Usage of the script is fairly straightforward.

1). First, we need to enable usernames in search results. Do any eBay search. In the right-hand corner, above the results, there’s a “View” drop-down. In this you’ll find “Customise”. Put a check-mark in “Seller information” and click “Apply changes”. This is a one-time action and it ensures that the seller name turns up in results listings in future. And that means that the UserScript can hook onto the names of unwanted sellers and remove them from the results.

2). Now install the script. If you’ve hidden eBay’s useless “Browse Related” taste-matching sidebar, unhide it (for instance, by temporarily turning off your uBlock Origin ad-blocker). That’s because the Block List control panel shows up at the top of that sidebar.

3). Put a check-mark in “Prune Results” (see above screenshot), then add a few user-names of unwanted sellers. It’s just a question of copying the username, pasting it in the Seller Block List, and pressing “Enter”.

4). Once you’ve done a few test searches and are happy you’ve barred all the unwanted sellers, you can turn uBlock Origin (or any similar ad-blocker) back on, and thus once again hide the useless “Browse Related” and other unwanted page sections. This means you’ll no longer see the Block List’s control panel in the sidebar, but the script will still go on working to block the unwanted users.
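For the curious, the pruning idea behind the steps above can be shown in a few lines. This is not the UserScript’s own code (that runs as JavaScript in the browser). It is just a Python illustration of the logic, with a made-up listing structure and the assumption that seller names can be compared case-insensitively.

```python
def prune_results(listings, blocked_sellers):
    """Drop every listing whose seller name is on the block list."""
    # Normalise names so that stray spaces or capitals do not defeat the match.
    blocked = {name.strip().lower() for name in blocked_sellers}
    return [
        item for item in listings
        if item["seller"].strip().lower() not in blocked
    ]


# Hypothetical search results, as (title, seller) records.
results = [
    {"title": "Old postcard of the market square", "seller": "cardcollector"},
    {"title": "Fridge magnet, market square", "seller": "MagnetMart"},
]
kept = prune_results(results, ["magnetmart"])
```

This also shows why step 1 matters: without the seller name present in each result, there is nothing for a block list to match against.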

Szczepanski’s List

My thanks to Jan Szczepanski for alerting me that his list has a new home. Szczepanski’s List of Open Access Journals is now available via the EBSCO.com website, as a freely downloadable .DOCX Word file. The page describes this 35,000-title / 9Mb / 672-page subject-ordered list as… “probably the world’s largest list of open access journals in the humanities and social sciences” plus Law, Statistics and Geology. No “probably” about it, this fine work is surely the largest OA-only list which is also freely available to the public. The EZB is at a similar scale, and does have 37,600 OA titles since 2010, but the EZB covers all disciplines including the sciences.

I had assumed the list had long since stopped updating circa 2015 and was now a legacy item, but the new header now notes “2,932 titles added 2017”. The file’s internal datestamp says the last update in Word was made March 2018. Unfortunately there’s no “First included in (date)…” component to the list entries, by which one might extract just the new 2016-18 finds. Nor does EBSCO appear to offer a rolling “What’s New”.

Having found the list again I’ve now added a link to it on the JURN Guide / Directory / FAQ pages. Despite its inevitable 15% linkrot (see below for details on that), Jan’s list has a big advantage over the JURN Directory, in that it can serve as a discovery tool for OA titles published in languages other than English, and which are unlikely to be on the DOAJ for various legitimate reasons. In contrast, the JURN Directory only lists titles which publish in English or are genuinely multi-language with English.

The home-page URLs given in Jan’s list appear to be hyperlinked with blue-underlining, but my copy of Word won’t allow me to launch these. That’s probably just a security thing on my PC. A simple “Save as…” to a .PDF does give me launchable URLs.
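Another possible route to the URLs, if Word won’t launch them, is to read them straight out of the file. A .DOCX is a ZIP container, and clickable hyperlinks are normally recorded as ‘relationship’ entries in `word/_rels/document.xml.rels`. The sketch below assumes the list’s links are stored that way (links stored as field codes would be missed), and the function name is my own.

```python
import zipfile
import xml.etree.ElementTree as ET

RELS_PATH = "word/_rels/document.xml.rels"
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def docx_hyperlinks(docx_path):
    """Return the hyperlink targets listed in a .docx file's relationships part."""
    with zipfile.ZipFile(docx_path) as docx:
        root = ET.fromstring(docx.read(RELS_PATH))
    urls = []
    for rel in root.iter(RELS_NS + "Relationship"):
        # Skip relationships for styles, images, fonts and so on.
        if rel.get("Type") == HYPERLINK_TYPE:
            urls.append(rel.get("Target"))
    return urls
```

The resulting list could then be written out one URL per line and fed to a link checker.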

One can also save the list from Word to a filtered .HTML page, which gives a 23Mb Web page with launchable URLs. Despite being so large, my Linkbot software didn’t have problems with my saved Web page version of the list. 60 minutes later the Linkbot results reported that the list has the disadvantage of link-rot:

* Contains 45,536 individual home-page URLs across approx. 35,000 titles (some records have more than one URL). Of broken…

* After discounting the useful redirects (e.g.: http:/ to https:/ or the ubiquitous OJS redirects from the home-page to the current issue page), an Excel tally told me there were 6,808 ‘fatal’ URLs such as outright 404s, ‘Unauthorized’ or ‘Host not found’. That’s a 15% rot rate.
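The sorting of link-check results into ‘useful redirects’ and ‘fatal’ URLs was done by hand in Excel, but the same triage could be sketched in code. The version below is only an illustration under my own assumptions: it takes the first HTTP status returned for each URL (or an error string such as ‘Host not found’), and it treats every 4xx or 5xx status as fatal, which may be stricter than the manual tally was.

```python
from collections import Counter


def classify_link(status, error=None):
    """Sort one link-check result into 'ok', 'redirect' or 'fatal'."""
    if error is not None or status is None:
        return "fatal"      # e.g. 'Host not found', timeouts
    if 300 <= status < 400:
        return "redirect"   # e.g. http to https, OJS home page to current issue
    if status >= 400:
        return "fatal"      # e.g. 404 Not Found, 401 Unauthorized
    return "ok"


def tally(results):
    """Count the categories across (status, error) pairs."""
    return Counter(classify_link(status, error) for status, error in results)
```

Dividing the ‘fatal’ count by the total number of URLs checked then gives the rot rate.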

Perhaps someone could now look at helping Jan with a crowd-funder, to pay some Web-elves (on Fiverr.com or similar) or unemployed recent graduates to fix the broken 15% of URLs? At $20 per batch of 25 URLs, I figure it should cost around $6,000.
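As a quick check on those figures, here is the arithmetic spelled out, using only the numbers quoted above and assuming that a part-filled final batch is charged as a whole batch.

```python
import math

TOTAL_URLS = 45_536        # home-page URLs found in the list
FATAL_URLS = 6_808         # URLs counted as 'fatal'
BATCH_SIZE = 25            # URLs fixed per paid batch
PRICE_PER_BATCH_USD = 20

rot_rate = FATAL_URLS / TOTAL_URLS
batches = math.ceil(FATAL_URLS / BATCH_SIZE)   # round up the last part-batch
cost_usd = batches * PRICE_PER_BATCH_USD

print(f"Rot rate: {rot_rate:.1%}")
print(f"Batches: {batches}, cost: ${cost_usd:,}")
```

That works out at 273 batches and $5,460, so a little under the rounded $6,000 figure, which leaves some headroom for fees.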

Evaluating Access of Open Access Research

“Practicing What You Preach: Evaluating Access of Open Access Research” (2017)…

“To explore the effectiveness of the new OA [DOI-based] finding tools, the next step of the study used the Chrome extensions for Google Scholar, Lazy Scholar (LS), Unpaywall, and the Open Access Button (OAB) to look for green OA versions of paywalled articles. [At 160 articles] The study sample size was triple the amount of articles that Grandbois and Beheshti (2014) found in their study.”

Newman Numismatic Portal

The Newman Numismatic Portal at Washington University in St. Louis has gone through all of Hathi and Archive.org and picked out all the numismatic journals and catalogues. These are now in a handy A-Z, the links of which lead to date-ordered lists of volumes and embedded page-viewers. Excellent work.

Sadly there are not yet tables-of-contents in HTML on the site, so the article titles can’t be indexed by JURN.

New type of Custom Search Engine

Google Custom Search has slightly expanded the range of services.

The Standard and Non-profit CSE services are unchanged.

They also offer a CSE via a JSON API: there’s no Google branding on that, but you pay $5 per thousand queries, and are limited to 10,000 search queries per day.

The new and fourth offering is a “Site Restricted JSON API”: it also requires the same “$5 per thousand search queries” payment. But if you search across no more than 10 URLs, then there’s no daily traffic limit.
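To make the difference between the two paid offerings concrete, here is a small sketch of the pricing rule and of how a request URL might be built. The price and the daily cap are as stated above. The endpoint paths and the `key`, `cx` and `q` parameter names are my assumptions from Google’s public documentation and may have changed, and the sketch ignores any free daily allowance.

```python
from urllib.parse import urlencode

PRICE_PER_THOUSAND_USD = 5.0
DAILY_QUERY_CAP = 10_000   # applies to the standard JSON API only


def daily_cost(queries, site_restricted=False):
    """Estimate one day's bill in USD for a given number of queries."""
    if not site_restricted and queries > DAILY_QUERY_CAP:
        raise ValueError("standard JSON API is capped at 10,000 queries per day")
    return queries / 1000 * PRICE_PER_THOUSAND_USD


def search_url(api_key, engine_id, query, site_restricted=False):
    """Build a request URL (assumed endpoints; check the current docs)."""
    base = "https://www.googleapis.com/customsearch/v1"
    if site_restricted:
        base += "/siterestrict"
    return base + "?" + urlencode({"key": api_key, "cx": engine_id, "q": query})
```

So a site hitting the standard cap would pay `daily_cost(10_000)`, i.e. $50 a day, while a site-restricted engine handling 25,000 queries would pay $125 with no cap in the way.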

I guess a use-case for this would be a huge and very heavily-used corporation like Boeing, where you want to offer your clients the quickest and most accurate way to search across all your technical reports, papers and manuals — which are spread across 10 different URLs? That use-case would likely need some guarantees from Google, though, on the spread and depth of the indexing.

Added to JURN

Canada and Beyond : a Journal of Canadian Literary and Cultural Studies

Japan Review (The Japan Institute of International Affairs) (not to be confused with the Japan Review from the International Research Center for Japanese Studies).

Freeside Europe (Kodolanyi Janos University, Hungary)

English Literature : Theories, Interpretations, Contexts

Lucerna (newsletter on Roman Britain, from the Roman Finds Group) (added to Directory only)

Pennsylvania Libraries : Research & Practice

Syntaxis : an international journal of syntactic research

Funes : journal of narratives and social sciences

Tema : Journal of Land Use, Mobility and Environment

+

FULIR, the main repository for advanced research from Croatia.

How to stop YouTube’s new animated ‘thumbnail previews’

How to block YouTube’s annoying and pointless new animated ‘thumbnail previews’ of videos in the search-results.

1. Open your uBlock Origin browser ad-blocking addon.

2. Go to uBlock’s “My Filters” tab.

3. Add …

i.ytimg.com/an_webp

…to any blank line, either at the top or the bottom of your list of filters, as you prefer. This leaves the static preview thumbnails intact, while stopping the ‘animated GIF-like’ previews that play when you mouseover a search result.

A similar rule should work in other ad-blockers.
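Why does such a short rule work? A plain filter like this one is matched as a fragment anywhere in the address of each requested file, so it only catches addresses containing `i.ytimg.com/an_webp`. The sketch below mimics that substring test. The two example addresses are illustrative guesses at the shape of YouTube’s thumbnail URLs, not taken from the original post.

```python
def matches_filter(url, pattern="i.ytimg.com/an_webp"):
    """Mimic a plain substring filter: block when the pattern appears in the URL."""
    return pattern in url


# Hypothetical example addresses (VIDEOID is a placeholder).
animated = "https://i.ytimg.com/an_webp/VIDEOID/mqdefault_6s.webp"
static = "https://i.ytimg.com/vi/VIDEOID/hqdefault.jpg"

blocked = [url for url in (animated, static) if matches_filter(url)]
```

Only the animated preview ends up in `blocked`, which is why the ordinary thumbnails survive.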

4m Open Library books, full-text, deep search

You can now ‘search inside’ all 4m Open Library books held at Archive.org, with your search seemingly constrained to just those books (and not the jumble that Archive.org also hosts). Nice results, with multi-snippets from deep inside the full-text of the books, plus phrase highlighting. This looks like excellent work, and it takes advantage of new tweaks by Archive.org’s search leader Giovanni Damiola.

A serious history researcher is still going to need to pound Archive.org itself and go through everything, but at first glance this seems to be a useful time-saver for those who only need to search the upper layers of the service.

The ultimate goal of the Open Library is “One Web page for every book ever published”. Think of it as one of those annoying university repositories where 95% of the full-text is not available yet, but will be one day… so “here’s a record page instead”. But in this case it’s for all books, and already has a substantial amount of full-text for free.