New: JURN ‘in a UserScript’

Have a UserScript addon in your Web browser? Then here’s a new script that lets you install a link to JURN into the top links bar in Google Search, thus…

With a search-query typed into the Google Search box, you then simply click on the new “Jurn” link. This sends your search-query over to the JURN CSE (hosted by Google), and runs it there. That’s all it does, but hopefully it will prove useful to many users.


To install for Google Search:

1. If you don’t already have one, install a UserScript handler addon in your desktop Web browser. [ Opera: Tampermonkey | Chrome: Tampermonkey | Firefox: Greasemonkey | Pale Moon: Greasemonkey for Pale Moon, etc.]

2. Then visit the GreasyFork page for the JURN UserScript. Click on “Install this script”. (If the “Install” button doesn’t show up, you may need to whitelist greasyfork.org in your script-blocker)

3. That should be all you need to do. Check that you have a new “Jurn” link on your Google Search top-links. You can manage the script and see its code via the UserScript handler addon in your browser.


To get a similar link at the top of DuckDuckGo:

A very similar link for the excellent DuckDuckGo search-engine can be added with the DuckDuckMenu UserScript.

1. Install the DuckDuckMenu UserScript.

2. Then visit DuckDuckGo and run a ‘test’ search. At the top of the page you now have a new list of links which can be configured. Clicking on a link here will take your DuckDuckGo search query and run it in another search service.

3. To add JURN to this new menu, click “Edit menu”. Remove any unwanted default links that ship with the script, then add JURN:

https://cse.google.com/cse/publicurl?cx=017986067167581999535:rnewgrysmpe#gsc.tab=0&gsc.q={searchTerms}&gsc.sort=

To also add Google Books and Scholar alongside your new JURN link, also add these links:

https://www.google.com/search?q={searchTerms}&tbm=bks

https://scholar.google.com/scholar?hl=en&q={searchTerms}

If you find that you get sent to local versions of Google, and want to skip the transfer, simply replace the .com bit with whatever your national Google domain is (e.g. google.co.uk).
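
For the curious, here is a minimal Python sketch of what such a menu link is doing with the {searchTerms} placeholder. The three templates are the ones given above. The function name fill_template and the national_domain option are my own, for illustration only, and I am assuming the query is percent-encoded before being dropped in (I have not checked how DuckDuckMenu itself encodes it).

```python
from urllib.parse import quote

# The three link templates given in this post.
JURN_TEMPLATE = (
    "https://cse.google.com/cse/publicurl?cx=017986067167581999535:rnewgrysmpe"
    "#gsc.tab=0&gsc.q={searchTerms}&gsc.sort="
)
BOOKS_TEMPLATE = "https://www.google.com/search?q={searchTerms}&tbm=bks"
SCHOLAR_TEMPLATE = "https://scholar.google.com/scholar?hl=en&q={searchTerms}"


def fill_template(template: str, query: str, national_domain: str = "") -> str:
    """Return the template with {searchTerms} replaced by the encoded query.

    national_domain is a hypothetical convenience: if given (for example
    "google.co.uk"), the first "google.com" in the URL is swapped for it.
    """
    url = template.replace("{searchTerms}", quote(query, safe=""))
    if national_domain:
        url = url.replace("google.com", national_domain, 1)
    return url


print(fill_template(BOOKS_TEMPLATE, "old english earendel"))
print(fill_template(SCHOLAR_TEMPLATE, "crist", national_domain="google.co.uk"))
```

The domain swap is only sensible for the Books and Scholar links. I would leave the JURN link on cse.google.com.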


Google’s Talk to Books

Talk to Books is a new Google service that tries to show book snippets directly relevant to your well-formulated query question.

I rather unfairly tried it with a fiendish question: When did J.R.R. Tolkien discover the word earendel? (without quote marks). None of the results were accurate, in contrast to Google Books where the top three results were accurate / useful / authoritative, as was the seventh result. Not only that, but the snippets were also more or less spot on.

It’s a tricky question, not only because lower-case earendel is Anglo-Saxon. There’s that use, and then there’s also the capitalised name Eärendil. Which is the name of the character Tolkien developed, after being inspired by the Anglo-Saxon word and its very complex clusters of meanings and associations. The other problem is that no-one knows exactly when in the first half of 1913 he found the word, and exactly where he found it — in a dictionary, a commentary or a footnote, or was he told about it by a tutor at Exeter College (he was having personal tuition with some of the great names in word-lore), or was his first encounter with the word while actually reading Crist in Old English? The word is the key root-hole for his work, from which the seed of his great legendarium later arose.

So, kudos to Google Books for getting it right in a useful way. Google’s Talk to Books on the other hand seems to have semantically smushed the words and ignored earendel. The name-authority was accurate, but it seemed to assume I just wanted something about Tolkien and his use of words in general, and tried a scattergun approach…

1. His fictional use of dwarf-names in the Icelandic Dvergatal (‘Dwarf table’), found in the Voluspa.
2. His early work on writing entries for the Oxford English Dictionary.
3. His mid-career attitude to the Celtic languages.
4. His entry for “Walrus” in the Oxford English Dictionary.
5. His early and reluctant abandonment of the worn-out word ‘fairy’ in his early poetry and invented-languages, in favour of ‘elf’.

There was no connection made between earendel and Eärendil, which suggests that perhaps Talk to Books might usefully add a semantic sub-system devoted to character-names, and their variants and mis-spellings?

Fixing YouTube

Here’s another post arising from taming my new Opera browser. This one lists some UserScripts and an addon to make the YouTube experience better. This is for my own future reference, mainly, but the list of links and suggestions may also be useful for others on desktop PCs.

Most of the time I just use the ever-reliable 9xbuddy service to download an .M4A audio-only file of the YouTube video, for listening on wireless headphones. I’ve always found the free 9xbuddy download service more reliable than a UserScript that places a “Download .MP3” link on the page itself. Regrettably no-one’s yet made a simple script to pass the YouTube page URL across to the 9xbuddy website, so using their service does require a manual cut-and-paste of the video’s URL.

On first landing at the YouTube video page, having the Disable audio/video autoplay UserScript installed is very useful. There are two types of autoplay on YouTube. This script simply turns off the first of these, the “Autoplay video on page load”.

Also useful, when first landing on a video, is the Chrome addon Hide YouTube Comments. All of the YouTube comments, gone. The Like/Dislike icons are kept.

In some cases I do want to play the video on the page, for example if it’s a software tutorial or a product unboxing/test. In that case the UserScript Disable YouTube 60 FPS (Force 30 FPS) makes playback faster on slow broadband, by forcing 30 frames a second rather than 60. This enables the viewer to step up to a 720p or even higher resolution, and thus to see fine details in the software’s user interface or on the product being tested. It’s especially useful for those who have a bandwidth-metered or throttled connection, or who have relatively slow rural broadband.

The UserScript YouTube Thumbnails is also very handy. Forget hunting along the sliding progress bar, squinting at tiny flickery video frames and trying to find the bit you want. With this script you just click on the word “Thumbnails”, and in pops a simple static storyboard for the complete video. This calls YouTube’s own frame thumbnails, and arranges them along a timeline. Then you click on a frame, and that jumps you to that point in the video.

As I said above, there are two types of Autoplay on YouTube. The second type is the one that autoplays the next video in the list, after you’ve finished watching the current one. Often this is a “suggested” video that YouTube “thinks” you will like. Given the current pitiful state of recommendation algorithms, this means the video is almost certainly unwanted and annoying. In this context the UserScript Disable YouTube autoplay is also useful. It simply automatically turns off the relevant Autoplay slider, just after the YouTube page loads.

There are of course various ways to change the colours on the YouTube interface to your taste, including UserStyles for Stylish. They break often, and may be more trouble than they’re worth.

More shiny Chrome

Three more Chrome addons, found today…

* LastModified. You land on a Web page, and it has a date like “16th May”. But, which year? Or perhaps it has no date on it at all. This handy little icon tells you the page’s exact last-update date, when you click it.

* Detect CCLicense. Auto-detects a Creative Commons statement on a Web page. Works, but is too subtle for me. I want the CC symbol to light up on detection, start pulsing a neon-green color at triple size, or otherwise do something that attracts attention. Especially when it hits CC-BY. Instead it just turns from grey to black, and doesn’t differentiate licenses. I couldn’t find anything similar, but more advanced. (Also has a native Opera version).

* Meow Met. Load a random cat from the Metropolitan Museum of Art collection, every time you open a new blank tab in your Web browser. With title and artist credit…

How to move from Firefox 55 to the latest Opera browser

I see that the adblocking addon uBlock Origin now offers reasonable Element Hiding functionality on Web pages. This may help those who feel they are stuck with Firefox 55 (because of the recent radical changes to the way FF handles addons), but who want to move to the likes of the Opera browser. gHacks has a good guide on how to transfer your Adblock Plus / Element Hiding Helper addon settings: “How to migrate from Adblock Plus to uBlock Origin”.

This means that Opera now looks like the most viable alternative for those currently stuck with Firefox 55. I’ve thus been encouraged to make the switch to Opera, spurred on by one-too-many ‘Heartbeat’ nags in Firefox about the need to update (it doesn’t matter if you turn off updates, they still nag). What follows is mostly for my benefit, when I need to refer to it in the future, but it may be useful for others.


Here’s my tested step-by-step on moving from Firefox 55 to Opera 53:

1. Download and install the latest version of Opera. Familiarise yourself and discover how to get extensions (aka addons).


2. Apparently Opera has a built-in adblocker, but if you’re someone who stuck with Firefox 55 then you want one with a good Element Hiding feature. So install the addon adblocker uBlock Origin and migrate your blocklist from Adblock Plus to uBlock Origin as per gHacks’ instructions. You can also copy over your Element Hiding rules. It’s fairly straightforward, just plug in any missing list subscriptions, then copy/paste into uBlock Origin | My Filters and Whitelist.

uBlock Origin’s method of Element Hiding is fairly easy and precise, once you get used to how it works. I’m happy with it.


3. Install ScriptSafe, a good per-site script blocker and whitelister. As with all such addons, it’s annoying at first, needing sites to be whitelisted on a per-site basis, and then you go back in again and whitelist whatever else the page is calling in to make it work. After a day or two all your regular websites should be whitelisted.

I see ScriptSafe also handles CANVAS blocking, so no need for an additional addon for that. But you need to turn that feature on manually, via its “Options” (Icon | Right-click | Options). You can even spoof your timezone and browser type, but once again it needs to be turned on manually and is probably best left off until there’s some reason to use it.

Note that ScriptSafe doesn’t offer Location Guarding (e.g.: “Our website knows you are in Chicago, by the lake, in the cafe… SEND DATA TO THE SAUSAGE COMPANY!” etc).

Nor does ScriptSafe handle pop-ups. You may also want Popup Blocker Lite which offers fine-grained control over what pops and how. Although it’s going to be very annoying for the first few weeks, as it works on a make-a-whitelist principle rather than a blacklist. Be aware that Opera has its own built-in pop-up blocker, which does work on a blacklist principle…


4. The Install Chrome Extensions addon also needs to be one of the first installs. Once this is in, it gives you access to Chrome store plugins such as Unseen (blocks FB’s privacy-invading “Seen by…” function) and Facebook Demetricator (removes pointless metrics micro-icons from the FB interface).

You’ll also want to be bookmarking and whitelisting the Chrome Web Store website.


5. The vital F.B. Purity for de-cluttering Facebook, of course. And Facebook Disconnect (stops FB from tracking you as you travel across the Web). Then transfer your F.B. Purity settings and block word-list from Firefox to Opera, which is easily done via the export/import of a text file.


6. Desktop search ninjas such as regular readers of this blog will want Stylish and its DuckDuckGo – Multi-Columns script, which gives DuckDuckGo search results a multicolumn layout. Make sure you uncheck “Send anonymous data to Stylish developers” after install. The same goes for Tampermonkey.

I have a DuckDuckGo MultiColumns guide on how to change the colours and results-number on the DuckDuckGo – Multi-Columns addon/layout. Also links to some other useful DuckDuckGo customisation scripts, though they require a UserScript handler addon — see the next step for that.

Other useful minor search-fixes are Google Search link fix (copy the real URL from search results, not Google’s garble); and the related Google Images Fix (add back the recently removed “View Image” button).

If you want your DuckDuckGo personal settings to stick when you exit, then in Opera you need to turn on “Allow local data to be set”. DuckDuckGo’s settings (accessed from the top-right of the home page) allow you to turn off nags, ads, autocomplete, favicons and other search annoyances.


7. Add the Tampermonkey UserScript handler addon. This enables you to install UserScripts. Desktop search ninjas will then want GoogleMonkeyR Fix (Jul 2017) which gives Google Search results a multicolumn layout suitable for a widescreen desktop monitor. Then Google Hit Hider by Domain.


8. The Browsec extension, a free VPN for pretending to be in the USA and thus getting past region-locking of Hathi Trust books etc. Yes, there’s a free VPN in Opera desktop, but it’s just been discontinued in the mobile versions of Opera and who knows how long it will last in Opera Desktop? Browsec is free, offers a backup to Opera’s VPN, plus the chance to pretend to be in Singapore (which has been useful more than once to get academic content).


9. A selected-phrase translator that will do (for now) is Google Translator Tooltip Expanded Fork, the most recently updated of the forks. It’s not inline but it’s fast and reliable, minimal, and easily configured.


10. For daily bloggers the oh-so-vital Redirect from the new WordPress.com editor to the classic WordPress editor, for old-school ease-of-posting to WordPress.com blogs.


11. Again for bloggers, Rich Copy URL for Chrome, for copying the current URL to the clipboard ready-wrapped in HTML link code. A nice time-saver for bloggers. It was the best one of about five addons I tested. It’s not right-click, but there was no good right-click option. This one uses a URL bar icon which you click on, and the code formatting for the URL can be easily configured.

A similar tool is Paste Email. When you just want to paste in your email address or other text snippet without typing it out.


12. Bloggers and news people will also be wanting a “Force an RSS Icon to Appear on the URL Bar”. Nope, there doesn’t seem to be one for Opera, not one that can send the feed to a desktop RSS reader. Nor a UserScript or Chrome addon. A pity. The closest is RSS Finder. Click its browser-bar icon, and if an RSS feed exists, you get a link that’s easily copied to the clipboard. It’s better than nothing.


13. Transfer Google Hit Hider list. Once the addons are set up, you then need to go back to Firefox, visit Google Search, pop out the “Manage Hiding” panel. Then export your Google Hit Hider list (there’s an import-export option) of all the Web domains you never want to see in a page of Google Search or DuckDuckGo results. I see that I have over 25,000 blocked URLs on my personal block-list. Notepad++ is good for copying and saving out as a list.txt file, but you can also just copy/paste.


14. Then move the commonly used bookmarks, bookmarks bar, and plug the passwords back in for the main services. If you go to Opera’s bookmarks you can do this automatically via the Import/Export option, which can also bring over cookies and passwords from Firefox. Figure on spending an hour re-sorting bookmarks.

Your Facebook bookmark bar link probably needs to be the secret Sort by date one, if you’re a blogger and timely information scooper and need your FB feed to run in date order.


15. Colour-fidelity and fonts.

If colour fidelity is important to you, then you probably need to turn off “Use hardware acceleration when available” in Opera. This is found in: Settings | Browser | System. Otherwise your colours may generally be rather washed-out. I’d rather have good colour, than a little extra speed.

Be careful in tweaking the Opera font settings. There’s no “just reset the font settings” button, and you can get into a font-settings tangle quite easily, with no way to back out of it. Be aware that DuckDuckGo has its own font settings.


16. Lastly, make Opera your PC’s default Web browser (Windows : Start | Control Panel | Default Programs | Opera). That’s it. You’ve left Firefox.


Other possibly useful Opera compatible addons:

* Reader View (similar to Firefox’s Reader View) and Instapaper, for getting a nice clean view of newspaper and magazine content. Used together they should mean that you rarely have to endure reading an article on the actual website of a newspaper or magazine, surrounded by clutter and tracking.

* Shareaholic for Pinterest. More reliable than the official one, in my experience.

* Turbo Button. Opera’s developers removed the Turbo on/off toggle button. This puts it back, enabling easy access to one of Opera’s most unique features.

* Opera has its own screenshot tool, but that only saves screenshots as a .PNG file which can be rather large in file-size. There are more capable screenshot tools out there, offering .JPG with variable compression, capture of scrolling windows etc.

* I don’t care about EU cookies. If you’re in the UK or Europe, this will help banish the EU cookies nags. You probably want to install this last, a few days after you’ve logged into Google, Facebook, PayPal etc for the first time with the new browser.

* Scholar on Google Search.

* Some scholars will prefer to open PDFs in their native desktop reader, rather than previewing the PDF inside of Opera. The setting to change for this is in: Settings | Websites | Download PDF…

When you click on the .PDF link in the built-in viewer it then downloads like any other downloaded file, but it won’t then auto-open with your reader software. Your software may have a “watch” capability to detect new PDFs appearing in the downloads folder, and it will then automatically open them up.

To auto-open a .torrent file with Opera, check the settings in your desktop torrent software. Many will offer to “watch” the default download folder for new .torrent files, and then automatically start them up.

You don’t need a “Scroll back to top of page” button via an addon. In Opera you just click on the tab and you’re back at the top of the page. Click it again and you’re back where you were lower down on the page. That said, having a Scroll to Top button always be in the same place can be useful.

“I’m sorry Dave, I can’t do that…”

“I’m sorry Dave, I can’t do that…” That’s the default position from the makers of the Firefox Web browser.

It doesn’t matter that you’ve turned off all the browser’s Update settings (Options | Advanced | Update | “Never check for updates”) and so on. You’ll still be nagged. Regularly. With a slide-down into your screen, while you’re browsing, telling you the browser is ‘out of date’.

There is a simple way to fix it. It took a long time to find, but applying it worked for me:

1) type about:config to get into Firefox’s deep settings.

2) Search for the config entry extensions.shield-recipe-client.enabled and ensure it is set to False. A double-click should toggle the True/False settings.

3) Exit about:config and restart Firefox.
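
If you would rather not redo the about:config toggle by hand on each profile, the same preference can in principle be set from a user.js file in the Firefox profile folder, which Firefox reads at startup. What follows is only a sketch in Python of how the needed line could be generated and merged into existing user.js text. The function names are my own, and I have only tested the manual about:config route above against the nag, not this one.

```python
# The preference name is the one from step 2 above.
SHIELD_PREF = "extensions.shield-recipe-client.enabled"


def pref_line(name: str, value: bool) -> str:
    """Format one boolean preference in user.js syntax."""
    js_value = "true" if value else "false"
    return f'user_pref("{name}", {js_value});'


def with_pref(user_js_text: str, name: str, value: bool) -> str:
    """Return user.js text with the preference set exactly once.

    Any existing line mentioning the same preference name is dropped,
    then the new line is appended at the end.
    """
    kept = [ln for ln in user_js_text.splitlines() if f'"{name}"' not in ln]
    kept.append(pref_line(name, value))
    return "\n".join(kept) + "\n"


print(with_pref("", SHIELD_PREF, False), end="")
```

The printed line is what would be saved into user.js. Where the profile folder lives varies by system, so I have left the file writing out of the sketch.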

In the end, though, this nag prompted me to make the switch to the Opera browser. Which I’ll detail in another post. Well done, Firefox guys.

Free OCR for German blackletter text

The free open-source Tesseract OCR 4.0 for Windows (beta, 64-bit), released 14th April 2018.

“The Mannheim University Library uses Tesseract to perform OCR of historical German newspapers. Normally we run Tesseract on Debian GNU Linux, but there was also the need for a Windows version. That’s why we have built a Tesseract installer for Windows.”

The Tesseract engine was apparently originally from Google, in use there at Google Books, but Google made it open source.

Tesseract 4.0 supports OCR in a range of old and ancient letterforms including German blackletter (aka Fraktur, in popular parlance ‘Gothic’), but these need to be selectively enabled at install…

Once installed there are a few Windows GUI front-ends to choose from, with which to operate Tesseract. gImageReader is 64-bit Windows and current. On their forums I found a gImageReader beta version that is newly-compiled for Tesseract 4.0 beta. That needs to be launched in Windows Administrator mode, and then it also seems to require a Fraktur download, in order to handle OCR of German blackletter letterforms…

I’m assuming that gImageReader ‘knows’ where Tesseract 4.0 is, and hooks into it automatically. Because I didn’t need to set any file-paths to it, in gImageReader.

Once gImageReader is set up and the Fraktur toggle/icon is switched, even when taking a screenshot the OCR results were pretty good…

It can also handle complete PDFs, and seems to go at about 15 pages per minute on a modern desktop PC. Nice to have, and (in combination with Google Translate) useful if your research takes you back to the German literature of pre-1938 — but you can’t read German and certainly not in blackletter.
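
For batches of page images, Tesseract can also be run directly from the command line without any GUI. Here is a rough Python sketch of building and running that command. It assumes the tesseract program is on your PATH, and that the Fraktur data was installed under the language code "frk". That code is my assumption and may differ on your install, so check the output of tesseract --list-langs first. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def tesseract_command(image: Path, lang: str = "frk") -> list[str]:
    """Build the command: tesseract <image> <output base> -l <lang>.

    Tesseract adds ".txt" to the output base itself, so the base is
    simply the image path without its extension.
    """
    output_base = image.with_suffix("")
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_image(image: Path, lang: str = "frk") -> Path:
    """Run Tesseract on one image and return the path of the text file."""
    subprocess.run(tesseract_command(image, lang), check=True)
    return image.with_suffix(".txt")


# Show the command only; nothing is executed here.
print(" ".join(tesseract_command(Path("scan.png"))))
```

Calling ocr_image(Path("scan.png")) would then leave scan.txt beside the image, ready for pasting into Google Translate.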

There are probably online sign-up services that can do the same, these days, where you do a sluggish upload and have to deal with time-outs and usage-quotas etc. But I prefer the ease of having one’s own Windows desktop software.

Google Translate does PDFs

New to me: Google Translate now works on foreign-language PDFs. Perhaps it’s been available for a while, but I’ve seen no-one blogging about it.

It doesn’t work if you just right-click on the Web link to the PDF in, say, Google Scholar or JURN search results, and then select “Translate this page…”.

Instead you have to:

1) Right-click, and copy to the clipboard the direct PDF link.
2) Visit Google Translate, manually paste in the URL you just copied.
3) Click on the URL that appears over in the facing box.
4) The PDF text appears extracted, in the form of a Web page, and translated.

Very useful, and I had excellent results with a Polish article I tested. I had the whole article translated, too, not just the first few paragraphs.

Note that a ‘redirect URL’, which gives the PDF but hides the direct URL link to the PDF, is of no use in the above workflow.
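
The paste-the-URL step can likely be skipped by building the Google Translate link directly. The sketch below is in Python and rests on an assumption: that Google Translate accepts a page address in the form translate.google.com/translate?sl=auto&tl=en&u=... with the PDF link percent-encoded in the u parameter. That pattern may change at any time. The helper names are my own, and the check for a direct PDF link is only a crude test of whether the URL path ends in .pdf.

```python
from urllib.parse import urlencode, urlparse

# Assumed endpoint, not confirmed by anything in this post.
TRANSLATE_BASE = "https://translate.google.com/translate"


def looks_like_direct_pdf(url: str) -> bool:
    """Crude check: does the path part of the URL end in .pdf?"""
    return urlparse(url).path.lower().endswith(".pdf")


def translate_url(pdf_url: str, target_lang: str = "en") -> str:
    """Build a Google Translate link for a direct PDF URL."""
    params = {"sl": "auto", "tl": target_lang, "u": pdf_url}
    return TRANSLATE_BASE + "?" + urlencode(params)


link = "https://example.org/paper.pdf"
if looks_like_direct_pdf(link):
    print(translate_url(link))
```

A redirect-style link such as https://example.org/download?id=42 fails the check, which matches the note above about redirect URLs being of no use here.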

Sadly I guess it’s also a route to plagiarism for students. I’d suggest that the anti-plagiarism detector-bot services might usefully build a bank of Google-translated theses and dissertations, to add to their phrase-detection sources. Teachers who mark suspiciously-excellent final dissertations, and who are then inclined ‘to go on the hunt’, should also be aware of the possibility that the lacklustre student may have run a foreign dissertation through Google Translate and then lightly re-written it for clarity in English.