“A Google engineer has developed an algorithm that spots breaking news stories on the Web and illustrates them with pictures.”
I found a 2013 article from geoscientists who had tested Google Scholar: “Literature searches with Google Scholar: Knowing what you are and are not getting”. Although the body of the paper states that their test phrase was “wildfire-related debris flows”, the data shows they actually tested Scholar with the unquoted keywords wildfire-related debris flows. They reportedly found that…
“free articles were available in PDF format for 88% of citations returned by Google Scholar. They were available from open-access journals or via links to organizational sites where authors had posted their publications.”
However, if you actually look at their linked search-results data file, the above statement needs additional clarification, since it’s clear that paywalled articles from Elsevier, Springer and the like, appearing in their Scholar results, were being counted toward those “free articles”. It turns out that many of these were “free” only via a DigiTop proxy overlay for Scholar that is, in the words of DigiTop, “available to USDA employees only”. Nice if you work under the U.S. Department of Agriculture umbrella, but it seems that those outside have to pay.
Does Google Scholar perhaps need to add some kind of “paywall box detector” to its scraper bots? Then perhaps something like [PDF] [-||-] could be added on the right-hand column of the Scholar results, to indicate a PDF that’s “available maybe” — but which will prove to have a paywall that needs to be either backed out from or negotiated? And perhaps [PDF] [-~-] could indicate a genuine direct link to a bona fide PDF file?
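The “bona fide PDF” half of that idea is at least easy to sketch. A well-formed PDF file begins with the signature `%PDF-`, whereas a paywall or login page served from a PDF-looking link is usually HTML. Here is a minimal sketch in JavaScript, assuming a bot already holds the first few bytes of the response. The function name `classifyPdfLink` is my own invention, and nothing here is anything Google actually offers.

```javascript
// Hypothetical helper: label a fetched "PDF" link by looking at its first bytes.
// Assumption: `bytes` is a Uint8Array holding the start of the response body.
function classifyPdfLink(bytes) {
  const magic = "%PDF-"; // signature at the start of a well-formed PDF file
  if (bytes.length < magic.length) {
    return "[PDF] [-||-]"; // too short to be a PDF: "available maybe"
  }
  for (let i = 0; i < magic.length; i++) {
    if (bytes[i] !== magic.charCodeAt(i)) {
      return "[PDF] [-||-]"; // probably an HTML paywall or login page
    }
  }
  return "[PDF] [-~-]"; // looks like a genuine PDF file
}
```

A check like this only reflects what the bot itself is served, so a PDF that is free via a proxy (as with DigiTop) would still need separate handling.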
Anyway… this is what geoscientists are talking about when they refer to wildfire-related debris flows. Seems like it might be a geological process that intelligent farmers, hiker-campers, and treeline homesteaders around the world would like to learn some precise details about…
Giant mudslides, basically.
Incidentally, the same wildfire-related debris flows search in JURN needs to be tightened up just a little for strong results. Using wildfire-related “debris flows” works better, though the first six pages of good results do stray just a little (to pick up what seem to be three articles about prehistoric ‘dinosaur-era’ debris flow events). Yet even on this test JURN appears to be doing about twice as well as Google Scholar in terms of getting open articles, once Scholar’s ‘false-positive’ paywall PDFs from Elsevier & co. are subtracted from Scholar’s results.
Paperpile has been reviewed by PC World magazine (4th March 2014). Paperpile is a browser-based competitor to Mendeley. It integrates tightly with Google services such as Google Scholar and Google Drive, and can also slurp academic PDFs “directly from Google search results”. I’d be interested to hear if it works with JURN. Once the found PDFs are in your Google Drive cloud storage, it’s reported that…
“Paperpile analyzes your papers and acquires all the necessary metadata by itself.”
Sadly it’s only for the Chrome browser, not Firefox. At present it seems to be just a personal workflow aid, since there’s no collective exposure of the found content to a single public search box (as is offered by Mendeley’s “Search papers” search box).
Most papers will be downloaded at speed, because they “seem they might be worth looking at later”. Yet if Paperpile were able to measure re-open rates, view duration and frequency, and the actual level of citation in a person’s finished project or work, then that would be an interesting basis for a bumping algorithm that could help power the results ranking in a public searchable catalog. Especially if Paperpile could broadly match or align your research interests with those of similar Paperpile users, in combination with a more standard citation analysis, to give you a tailored search experience. Although in practice I guess there would be huge and possibly unwanted feedback amplification loops generated by that approach, as search results could veer heavily toward the latest fashionable topics. Doubtless Google has this nailed down already, and there’s probably a Trendy Search Topic Surge Controller employed somewhere in the Googleplex.
The Google Cultural Institute website is new to me. It seems Google has a Pinterest, sort of. It appears to work in much the same way as Pinterest, but the pictures are drawn from images in various hi-res/open museum digitisation collections.
No ‘kitties in art’ collection yet, although searching for “cat” will get you a big kittie fix if you’re desperate.
One of Google’s public data-driven prediction systems has caught a cold, according to weighty new research…
“Google Flu Trends, which launched in 2008, monitors web searches across the US to find terms associated with flu activity such as “cough” or “fever”. It uses those searches to predict up to nine weeks in advance the number of flu-related doctors’ visits that are likely to be made. The system has consistently overestimated flu-related visits over the past three years, and was especially inaccurate around the peak of flu season — when such data is most useful.”
The doctors prescribe taking a healthy dose of national health statistics…
“Merely projecting current CDC data [doctors’ visits as recorded at the US Centers for Disease Control and Prevention] three weeks into the future yields more accurate results than those compiled by Google Flu Trends. Combining the two resulted in the most accurate model of all.”
Although one has to wonder about prediction feedback loops here. What if Google Flu Trends was actually right, but Trends-watching doctors, carers and the public all put into effect various extra measures that stopped the Trends prediction from coming true in the longer-term six-to-nine week window? Or what about some kind of media amplification loop: more media chatter hits the news as the epidemic surfaces into the public mood, meaning that non-sufferers start using the relevant keywords more in social media?
It seems that Google Search has committed to its new code for displaying Google Search results, after trialling the changes last week and then withdrawing them. The changes break the vital browser addon GoogleMonkeyR. A temporary fix is to edit the GoogleMonkeyR userscript thus…
Find this line:
var list = document.getElementsByXPath(".//div[@id='ires']/ol/li[starts-with(@class,'g')]/div/parent::li");
And replace it with this one:
var list = document.getElementsByXPath(".//div[@id='ires']/ol/div[starts-with(@class,'srg')]/li");
Confirmed as working with Google.com search. Fails when you switch the keyword through to Google News.
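Since Google trialled, withdrew and then re-applied the new markup, a more defensive version of the fix might try one XPath and fall back to the other. Here is a rough sketch, assuming the second `var list` line above is the one that matches the new layout. `firstMatchingList` is a name I made up, and `query` stands in for whatever XPath helper the script already uses (in GoogleMonkeyR that appears to be `document.getElementsByXPath`).

```javascript
// The two XPath expressions from the fix above: assumed-new layout first, older layout second.
const RESULT_XPATHS = [
  ".//div[@id='ires']/ol/div[starts-with(@class,'srg')]/li",
  ".//div[@id='ires']/ol/li[starts-with(@class,'g')]/div/parent::li",
];

// Hypothetical helper: return the first non-empty list of result nodes.
// `query` is assumed to take an XPath string and return an array-like of nodes.
function firstMatchingList(xpaths, query) {
  for (const xpath of xpaths) {
    const nodes = query(xpath);
    if (nodes && nodes.length > 0) {
      return nodes;
    }
  }
  return []; // neither layout matched
}

// Inside the userscript this might be called as:
// var list = firstMatchingList(RESULT_XPATHS, function (xp) {
//   return document.getElementsByXPath(xp);
// });
```

This would not fix the Google News failure by itself. It just avoids having to re-edit the script if Google flips between the two layouts again.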
UPDATE, NOV 2014.
Still working fine for me, with a few tweaks…
2. I access Google Search via this URL, which has a parameter that limits search results to 15 per page…
15 fits nicely in three columns, which I also have set up in GoogleMonkeyR Prefs — which is the cog-wheel that appears top-right once you make a Google search.
3. Hide the “Searches related to test” element on the Google Search results page, by using the AdBlock Plus addon (right-click on “Searches related to test”, ‘Inspect Element’, highlight whole ‘extrares’ element, click on red AdblockPlus icon, block). This bit gets hidden because otherwise it sits awkwardly between you and the numbered links that lead to the subsequent results pages.
Wouter has hacked out a Google Scholar API workflow today, sort of. I suspect the reason Scholar has never offered an API is the agreements Google has with the large commercial journal publishers and citation database providers.
I note that Google Scholar’s single author citations pages are now to be found in the main Google Search results…
Although it seems that if a prolific or influential author has two or more pages of citations, only the first will show up in Google Search. For example…
site:scholar.google.com/citations “graham harman”
In an unusual move Google appears to have created its own Custom Search Engine, Custom Search for K-12 Computer Science Education. For the benefit of those outside the USA, “K-12” isn’t the name of some obscure Linux module. It seems to be U.S. educational jargon indicating: “state schooling for kids aged 5 to 16”.
A new study, “Google Scholar and DSpace”…
“The average indexing ratio [in Google Scholar] for our sample of 10 recent DSpace repositories is 64.8%”
I wonder if the interface presentation has an influence? http://circle.ubc.ca/ is totally hardcore in presentation and keywording, and is indexed at 99%. Whereas http://dash.harvard.edu/ has a more student-friendly blog-like look and feel to it, and is indexed at just 26% despite the harvard.edu domain. But perhaps not, as I guess it’s more likely due to the presence or otherwise of good machine-readable metadata.
Just when blogs were making a comeback, after the inane collective Twitter-gasm of the last few years… today Google has removed “Blogs” from the switch-through options at the top of the Google Search results …
They’ve also recently made some pointed comments to bloggers about allowing spammy “guest bloggers” to use their blogs.
Google has announced that Google Chrome browser users will not be allowed to install their own choice of plugins, addons, and userscripts, from January 2014. Today I moved over to using Firefox, as a result. Here are my notes on the “how to” of the move from Chrome to Firefox, in the hope the notes may help a few others:
1. Backup any old bookmarks from any existing install of Firefox. It seemed best to start fresh, so I removed the old version of Firefox via a full uninstall.
2. Download and install the very latest Firefox. As this was a fresh install, the first time Firefox loads it should offer to automatically port over all your bookmarks, toolbar bookmarks, passwords, etc. from Chrome. (The tiny favicons will only reappear, next to bookmarks on your toolbar, when you revisit those bookmarked pages).
3. Tweak the Firefox interface. I prefer to get back to a retro look with Classic Reload-Stop-Go Buttons.
Then go View | Toolbars | Customize. While this Customize library window is open, you are able to drag around the navigation icons in the navigation bar. Get the icons positioned how you want them, then before you close the Customize library window choose “Icons + Text”. Then click “done”. This is how I like the top left on my browser…
5. Add some basic advert and click-jacking blocker add-ons:
NoScript (annoying initially)
And then in Firefox go to: Tools | Addons | Plugins and disable all the craptastic media-player plugins that ship with Firefox (RealPlayer and the like, ugh). I only left Flash on “Always Activate” — since the Flashblock add-on (above) keeps it under control.
6. Then block the web’s other annoyances with these add-ons:
Facebook Purity (and import any blocklist / settings from your Chrome version of F.B. Purity)
7. Add userscript capability to Firefox:
Greasemonkey (required for running all userscripts). Followed by…
GoogleMonkeyR. Vital for working with Google Search, in my opinion. I set it up to display results in three columns, and also to block several bits of Google Search cruft.
(To find GoogleMonkeyR settings: make any search in Google, then right-click on the grey cog. Bear in mind that ticking “Don’t display the Google Web Search dialogues” may prevent the search box appearing above the top of search results in Google Images, and Google Books).
Direct Links in Google Search. This forces direct URLs to be used in the search result links.
Google Hit Hider by Domain (blocks Google Search results by unwanted domain). Import your old Google Search blocklist from “Personal Blocklist (by Google)”, then use the de-duplicate tool in Google Hit Hider…
8. Finally, go to Tools | Options | General | Home Page. There paste in this handy home page URL, which will send you to the main Google Search when you click on the Home button in Firefox:
This special URL has certain parameters embedded in it, which:
* force Google Search to use Verbatim (it searches on just what you type, not what it guesses you might want)
* set the number of results to 18 (perfect with a widescreen monitor and GoogleMonkeyR using three columns)
* force the top Search Tools open, displaying drop-down items
* force Google Search to use its complete main USA index, without making an automatic switch to a local version
* and turn Google Search’s Autocomplete off.
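For anyone wanting to roll their own version of such a URL, here is a rough sketch of how one might be assembled. The parameter names are assumptions about Google’s query string as it has behaved in the past (`num` for results per page, `tbs=li:1` for Verbatim, `gl=us` for the USA index, `complete=0` for Autocomplete off), and Google may change or ignore any of them. The Search Tools setting is left out, as I’m not sure of its parameter. Treat this as an illustration, not as a copy of the home page URL.

```javascript
// Hypothetical builder for a Google Search URL with fixed preferences.
// All parameter names below are assumptions and may stop working at any time.
function buildSearchUrl(query, resultsPerPage) {
  const url = new URL("https://www.google.com/search");
  url.searchParams.set("q", query);
  url.searchParams.set("num", String(resultsPerPage)); // results per page
  url.searchParams.set("tbs", "li:1"); // Verbatim mode
  url.searchParams.set("gl", "us"); // prefer the main USA index
  url.searchParams.set("complete", "0"); // Autocomplete off
  return url.toString();
}

console.log(buildSearchUrl("debris flows", 18));
```

Building the URL with `searchParams` means spaces and punctuation in the keywords get encoded automatically, which is easy to get wrong when editing such URLs by hand.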
The resulting ad-free nag-free search results layout, with GoogleMonkeyR and the above fixes:
9. You can use the same URL trick with a Google News search, dragged onto your bookmarks bar, thus:
Replace the keyword in the above URL with your own. Switch out “uk” for “us”, etc.
Also handy is this Google Books link, with parameters included:
10. Other Firefox add-ons that are also very useful:
* the free grammar and spelling checker After the Deadline + Menu Editor to reverse AfterTD’s impudent hijacking of the top of the right-click context menu in Firefox. Sadly there’s no way to have AfterTD use British English spelling.
* Bookmark Favicon Changer 2.0 (the only one that works with the latest Firefox)
* Instasaver (Instapaper saver button for Firefox) (works with the latest Firefox including Nightly developer version, requires an Instapaper account)
* NoSquint (a nice flexible and easily resettable zoom tool)
A useful new Google Scholar feature: Library. Save a personal selection from your search results, then share that collection with others. Now to write a bot that auto-bookmarks just the open access articles :)
It looks like I’ll be switching back to Firefox as a Web browser, over Christmas, as Google Chrome is set to block install of all extensions that don’t come from its own extension store. There is no way I could tolerate Google Search without GoogleMonkeyR, or Facebook without F.B. Purity. After The Deadline is also not on the Chrome extensions store.
Google has rolled out a major upgrade to Search…
“The new algorithm, codenamed Hummingbird, … the first major upgrade for three years … is especially useful for longer and more complex queries. … more capable of understanding concepts and the relationships between them rather than simply words”
Google has obviously demoted Google Scholar over the last year or so, as well as loosening the content-inclusion parameters. Max Kemman now asks: will Google close down Google Scholar? The article notes that…
“cited by” and “related articles” functionalities in Google Scholar […] are already available in [the main Google] Search
If he’s correct, there may be another reason for it. Have people in Google taken a good look at the slow-but-sure progress of Microsoft Academic Search, and found they don’t like what they see? Is Google wary of waking up one day to find that the Microsoft tortoise has once again executed its traditional killer slow-mo back-flip karate on a competitor hare?
How to do a reverse image search in Google Image Search:
1. Find and copy the original direct URL of the image which needs identifying.
2. Go to Google Image Search and click on the camera icon in the search box…
3. A search dialogue box will open. Paste the image’s URL into the box, and search…
4. View results…
You can also upload an image, rather than just pasting a URL.
A just-released Greasemonkey script Google Scholar Citation Exporter…
“Extension of Mayank Lahiri’s ‘Google Scholar Citation Exporter’ that prints results to CSV, for further use in other applications.”
The new Google Online Course Builder…
“our experimental first step in the world of online education”