Got itchy breadcrumbs in your Google Search results? The new UserScript Google Search restore URLs (undo breadcrumbs) works for me.
In the last week or so Google has made some slight changes to the default styling templates for CSEs, resulting in the numbered pagination links at the foot of the search results becoming very small and grey. This has now been fixed on JURN, and your per-page links to more search results should now look like this. They should be far more easily selectable now, especially for touch-screen users…
My thanks to Amit Agarwal of India, for the elegant snippet of commented CSS for the .gsc-cursor-page element. If you have the same problem with your own CSE, this snippet goes in the style header of your page. Colours are controlled elsewhere, in the ‘Look & Feel’ | Customise | Refinement section of your CSE admin dashboard.
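The snippet is along these lines (a reconstruction from memory rather than Amit's exact code, with illustrative values, so adjust the size to taste)…

```css
/* Restore the size and prominence of the numbered pagination
   links at the foot of the CSE results. Values are illustrative. */
.gsc-cursor-page {
  font-size: 16px !important;
  padding: 4px 10px !important;
  cursor: pointer;
}
```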
Changes may not show up until you and your users refresh your main page a few times, due to Web browser caching.
GRAFT has also had the same fix applied.
Also add padding for the pagination row, by adding the following to your CSS style (I have mine in the page itself)…
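Something like the following (illustrative values, and assuming .gsc-cursor-box is still the default class name Google gives the pagination row)…

```css
/* Add some breathing-room around the whole pagination row. */
.gsc-cursor-box {
  padding: 12px 0 !important;
  margin-top: 10px !important;
}
```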
It appears that Google Search doesn’t track the Internet Archive (Archive.org) in anything like real-time for the useful content. For instance, see:
site:archive.org staffordshire -cannock -bbc
On this search you have to go to “Last year” to get anything useful from Google Search, with September 2018 being the latest datestamp I can see among those results. This gives the appearance that Google is only indexing Archive.org on a quarterly or bi-monthly basis.
Yet a search for…
site:archive.org with the ‘Past week’ time filter applied
… does pick up material from Archive.org, but by the looks of it, it’s only the utter rubbish, sex fantasies and spam that Google will want to rapidly exclude or make effectively undiscoverable. My guess is that there’s an ongoing low-level indexing of the new material purely in order to identify the junk, expose it to some user selection to try to sift out any false positives, and that this is then fed in as an ‘exclude’ junk-list for each larger quarterly re-indexing.
Here are some updated fix instructions for the latest GoogleMonkeyR UserScript, which many desktop power-searchers use to give their Google Search results a three-column multicolumn layout.
* Problem: the script breaks Google Image search results, by running on such searches. Specifically, the script appears to be preventing the central ‘slider’ div from opening up, when an image is selected from the Google Image search results.
* Solution: In your Web browser, access the raw GoogleMonkeyR script. For instance, in Opera this is done via: Extensions | Tampermonkey | Installed UserScripts | GoogleMonkeyR | Edit.
You then need to paste in a line of code that explicitly turns off GoogleMonkeyR, but only whenever the browser is running a Google Images search. To do this, add the following line to the header of your GoogleMonkeyR script, below all the // @include lines…
// @exclude http*://www.google.*/search?*isch*
Google Image searches have “isch” in their URL, so we can grab onto that and exclude such URLs. Save (click the disk icon) and exit. You should now be able to operate the Google Images results as usual, while still retaining your usual three-column layout for the main Google Search.
Google has a new Dataset Search tool. It looks good.
An initial test search for Krita (the open source paint software) didn’t pick up anything, so it is just limited to datasets and is not also bringing in general file-names from FTP servers.
A wide search for Antarctica Cephalopods then gave a good set of 25 results, all of which were record pages that appeared to place their dataset under CC or to be public domain (NASA etc). There doesn’t appear to be any way to then load a further set of results, or to do a further keyword search within the record-pages of the results.
Google Custom Search has slightly expanded its range of services.
The Standard and Non-profit CSE services are unchanged.
They also offer a CSE via a JSON API: there’s no Google branding on that, but you pay $5 per thousand queries and are limited to 10,000 search queries per day.
The new and fourth offering is a “Site Restricted JSON API”: it also requires the same “$5 per thousand search queries” payment. But if you search across no more than 10 URLs, then there’s no daily traffic limit.
I guess a use-case for this would be a huge and very heavily-used corporation like Boeing, where you want to offer your clients the quickest and most accurate way to search across all your technical reports, papers and manuals — which are spread across 10 different URLs? That use-case would likely need some guarantees from Google, though, on the spread and depth of the indexing.
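For anyone curious, a JSON API query is just an HTTP GET against the customsearch endpoint. A minimal sketch in Python (the key and cx values here are placeholders for the credentials you’d get from your API console and CSE dashboard, and cse_query_url is my own helper name):

```python
from urllib.parse import urlencode

# Placeholder credentials -- substitute your own API key and CSE ID.
API_KEY = "YOUR_API_KEY"
CSE_ID = "YOUR_CSE_ID"

def cse_query_url(query, start=1):
    """Build a Custom Search JSON API request URL for the given query.

    'start' is the 1-based index of the first result to return,
    used for paging through result sets.
    """
    params = {"key": API_KEY, "cx": CSE_ID, "q": query, "start": start}
    return "https://www.googleapis.com/customsearch/v1?" + urlencode(params)

print(cse_query_url("antarctica cephalopods"))
```

Fetching that URL returns a JSON document with an “items” list of results, which is what you’d pay the $5 per thousand for.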
Under pressure from commercial image library Getty, Google Images has removed a key button from its search results. It’s the “View Image” button, which allowed people to view an image in isolation, against whatever colour they have set as a background for the Web browser.
The removal is easily fixed with a simple new script:
Chrome and Chrome-compatible: Google Search "View Image" Button
If you also want to change the default background colour (white can be better for screen-shots of logos for Facebook posts, to get an edge), in Firefox you can change the Web browser’s default background from black thus: Tools | Options | Content | Colours | Background | OK.
There are also press reports that the “search by image” icon in the Google Images search box is to be removed, also due to Getty pressure. But I see it’s still there on the UK version of Google Images.
In Autumn 2017 Google announced that Google Search would ignore the country domain of its service, and instead serve you national results based on what Google thinks your geographic location is…
“the choice of country service will no longer be indicated by domain. Instead, by default, you’ll be served the country service that corresponds to your location.”
Here’s my quickstart on some of the nation-specific research options which can route around this. You either need to:
i) use the likes of DuckDuckGo and add national URL Parameters to the end of your bookmarked URL: e.g. Hungary. Top results are not great in that instance, with BBC, Wikipedia and Guardian cruft, but they quickly become relevant as you scroll down. Adding site:hu helps a lot, at the cost of knocking out local grassroots blogs on WordPress and Hungarian .org and .com sites etc.
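For the record, the DuckDuckGo parameter involved is (I believe) kl, which sets the region/language pair. A bookmarked Hungary search would then look something like…

```
https://duckduckgo.com/?q=ceramics&kl=hu-hu
```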
DuckDuckGo is now actually better than Google, in my opinion, for picture research. Though you will have to home-brew a Creative Commons filter within your search terms.
ii) Go to Google’s Advanced Search settings and (for now) you can request that Google Search “narrow your results” by nation. Clunky, but it may prove useful. I imagine there must be a browser plugin that allows this setting to be swiftly switched across various nations.
iii) use a VPN proxy in your Web browser. The Opera web browser has a free and sturdy VPN built in, but all you can do with it these days is to select broad regions rather than nations (as used to be the case). Adequate for things like quickly getting past region-blocking on public domain resources at Hathi, etc, but not that useful if you just want to research ceramics in Morocco.
iv) use a free VPN such as Browsec. This offers three or four free national VPN nodes, each of limited access duration (10 minutes or so before it becomes unresponsive). Again, useful for researchers wanting to access region-locked Hathi books or YouTube videos etc. Such freebie VPNs also offer an enticingly big list of other national nodes for paid users…
v) The TOR browser. Google’s new move potentially leaves sensitive ‘business researcher traffic’ open to being snooped on and tracked by hostile/piratic nations, who may either clandestinely run VPN services or tap into VPN traffic. As such, smaller businesses (especially those in a larger supply-chain but without security-savvy IT departments) might also look into the anonymous TOR browser’s capabilities before doing intensive country research. It’s my understanding that some TOR exit nodes can be geolocated to nations, while others appear to be free of geolocation, and apparently one can switch between these types and choose which nation the exit node is in.
So far as I’m aware, JURN has for some time now auto-detected your home nation and served results accordingly. Some types of user can route around this somewhat, by searching in a local alphabet and encasing words or phrases in quote marks (“مقارنة”) which in this case should mean the majority of search results are in Arabic.
I just ran a search on Google Scholar, and Scholar decided to present me with only two results (from Elsevier and Springer). The other 231 results (perfectly valid, often also from Elsevier and Springer) were hidden behind a small link to “See all results”. A curious new behaviour…
It seems we may need a browser add-on that forces “show all results” as the default page of results.
The new RSS change at Google News makes their existing keyword-based RSS feeds defunct. It affects the RSS feeds that collect all Google News items with a headline/snippet containing the words ‘bunny’ + ‘fluffy’, for instance. I don’t know if the generic catch-all ‘Science’, ‘Health’ etc RSS feeds are affected, as I don’t use those.
Those keyword-based feeds will now need to be changed. Changed slowly and manually and individually by slogging down the list in one’s RSS feedreader. It’s a big task to do, for some, and journalists and editors and bloggers will have hundreds (if not thousands) of these feeds set up.
So far as I can see there’s no way to export the OPML from one’s desktop RSS feedreader and then simply do a global search-replace of the Google News URL paths in Notepad++, then bring the OPML back in. The URLs are too complex and varied in their structures to allow that.
One way of tackling the change is as follows:
Aim: Open our list of feeds in Excel and extract only the Google News ones, thus making it relatively easy for a worker to run through them all and discover the new ones.
Software required: the free Notepad++, and MS Office Excel with Sobolsoft’s Excel Remove Text add-in.
1. Export your OPML master file from your RSS feedreader / newsreader.
2. Right-click on this and open the OPML in Notepad++. Search/replace all instances of "/> with "/>; and then manually go through and add a ; to the end of the remaining few lines which now lack one.
3. Search/replace all , (i.e.: all the commas) and change these to &&&&.
4. Save a backup of the changed OPML, then save another copy from Notepad++ — this time as “feeds.csv” which makes it a comma-separated Excel file. “But there are no commas left” you cry. That doesn’t matter, as Excel will treat the ; instances as if they were commas. And it won’t be terminally confused by commas sitting within the URLs, as we just changed them all to &&&&.
5. You can now load feeds.csv in MS Office’s Excel spreadsheet package. If you successfully put a ; at the end of each line of the OPML, Excel will happily load the file and display it correctly, i.e. in a similar way to the clear structured view you saw in Notepad++.
6. You’re now able to extract all the lines containing the phrase “Google News” and then do the same for “news.google”. There are a number of complex ways to do this, involving fiendish formulas, but a very easy way is with Sobolsoft’s Excel Remove Text, Spaces & Characters From Cells add-in. This gives Excel a number of very useful functions, including “Clear all cells not containing X”. Select all lines. Then clear everything not containing Google News. You can then ‘sort A-Z’, to get a neat list of all your defunct Google News feeds, one per line.
7. Select all lines with content in them. Then use the same add-in to “Remove all text before…”
xmlUrl=" (the OPML attribute that holds the feed URL). Then “Remove all text after…” the closing " quote mark at the end of each URL.
You can continue doing this sort of search/replace, and thus end up with a fairly clean set of the keywords, phrases and knockout -keywords which you were using for each Google News URL. For instance, you can search/replace
%22 with " to get recognisable search phrases again, inside the URL.
If you have hundreds or thousands of these, they can now be passed to a gig worker at Fiverr.com etc, tasked with working down your nicely cleaned one-per-line list to discover the new working RSS URLs from Google News. While they’re at it, you may as well pay them to discover the Bing News equivalents.
You may also want them to use a VPN in order to also snag the Google News USA equivalent URLs, if you’re in the UK etc. Although it appears possible that simply changing the end of the new URLs to
?hl=en&gl=US&ned=us does the trick and gets the USA version. Google News USA obviously has better coverage, and is perhaps updated more quickly. For instance, a UK-centric search for: newcastle-under-lyme -police in Google News UK has no search results. The same from the USA site has one valid result in a local freesheet two hours ago. Such timeliness may matter for journalists with deadlines to meet.
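If you have a batch of these to flip, the ending can be swapped with a trivial bit of scripting. A sketch in Python, assuming the country settings all live in the query string as above (to_us_edition is my own helper name, and the sample path in the example is just illustrative):

```python
from urllib.parse import urlsplit, urlunsplit

US_PARAMS = "hl=en&gl=US&ned=us"  # the USA ending quoted above

def to_us_edition(feed_url):
    """Replace whatever ?hl=...&gl=...&ned=... ending a Google News
    feed URL currently has with the USA parameter set."""
    parts = urlsplit(feed_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, US_PARAMS, ""))

# Illustrative UK-flavoured feed URL, flipped to the USA edition:
print(to_us_edition("https://news.google.com/news/rss/headlines?hl=en-GB&gl=GB&ned=uk"))
```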
8. You don’t then need to create a new OPML without any Google News URLs, and try to import it back to your newsreader etc. That’s a hassle and the OPML will probably break. So it’s easier to just let the defunct Google News URLs sit there and do nothing, since they’re not doing any harm. Some newsreader software may eventually flag them as defunct, and may even offer the ability to mass-delete your defunct feeds after 1st December 2017. Apparently that’s the date Google has set for the current feeds to die altogether.
9. Once your Fiverr gig worker etc comes back with the new URLs, either add in your new working Google News URLs by hand, or (if you have lots of them set up) have your Fiverr gig worker format them up as a valid OPML file for bulk import to your newsreader. That’s very simple to do, once you have a newly-working Google News sample line to show them, although I think there are website converters that will turn a one-per-line RSS URL list into a valid OPML with ease.
That’s the most efficient way I can think of for handling the changeover.
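(If you’re comfortable with a little scripting, the Notepad++/Excel steps above can also be done in one pass. A minimal Python sketch, assuming a standard OPML export with xmlUrl="…" attributes; google_news_feeds is my own helper name and the sample feed line is purely illustrative…)

```python
import re
from urllib.parse import unquote

def google_news_feeds(opml_text):
    """Pull every news.google feed URL out of an OPML export,
    returning sorted (decoded-for-reading, raw) pairs."""
    urls = re.findall(r'xmlUrl="([^"]+)"', opml_text)
    hits = [u for u in urls if "news.google" in u]
    # unquote() turns %22 back into quote marks etc., so the old
    # search phrases and -knockouts become readable again.
    return sorted((unquote(u), u) for u in hits)

# Illustrative two-line OPML fragment:
sample = '''<outline text="bunny" xmlUrl="https://news.google.com/news/feeds?q=%22fluffy+bunny%22"/>
<outline text="other" xmlUrl="https://example.com/feed.xml"/>'''

for readable, raw in google_news_feeds(sample):
    print(readable)   # the non-Google feed is filtered out
```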