Google Public Data Explorer. Sift datasets from over 2,000 public dataset providers, or upload your own.
An ex-Intel VP named Avram Miller has spun the blogosphere an amusing tale in which Apple launches its Found search-engine in Autumn 2015, with a…
“new search capability developed by Apple [that] would revolutionize search”
Miller is said to be at the heart of the Israeli tech scene, so I guess he might have heard something about an Apple contract or quiet company purchase. But I’d have liked to hear just a few more ideas from him, like maybe some speculation about an iWatch-enabled personal search that’s hands-free and search-box free. A stronger Google Now competitor is certainly something Apple needs. While Siri’s voice work is impressive, it apparently taps into er… Wikipedia, Bing and a much-criticised Apple Maps service. Google Search provides “just 4 percent of Siri data”. It would be more profitable for Apple, and a bigger blow to Google, if a Siri successor hooked seamlessly into an Apple fangirl’s entire Apple-o-sphere (hardware, software and services) in order to gain a pseudo-predictive ability to bring you what you probably need to know at any given moment or point in space.
Google Now already does that, of course. But only ‘sort of’, by drawing on your online Google activity + traffic reports, weather and event listings. So how to kill Google Now in its cradle, rather than simply compete with it? To do that, Apple’s predictive search might run on powerful machine-learning that’s been intelligently chewing on all your data for a whole year. All of it, from Big Data to small data: including your itemised grocery bills, your body’s geo-location and real-time biometric data, your home sensors, even a list of your boss’s personal foibles and your pet cat’s GPS-tracked movements. Plus all your online activity. So it really gets to know you, rather than trying to jam you into the mould of a rather dim weather-obsessed restaurant-hopping commuter. And it knows you in context, moment to moment. Apple is perhaps the only company that many would trust with such intrusive joined-up access to their life and work, so Apple might just be able to get sufficient traction. Admittedly Google is also in the AI race, but they certainly don’t have a working AI just yet, despite their recent promising purchases such as the UK’s DeepMind. What if Apple really has discovered a breakthrough in some back-bedroom in Tel Aviv?
Of course this is all just my before-breakfast speculation, just like Miller’s tale most probably is. But if Apple do have such a search strategy then they could certainly also provide the full range of hardware to support it, not simply a super-Siri in a wristwatch. To make the AI’s predictive algorithms mesh and work as intended, just augment your body / life / work / loved-ones with Apple’s beautifully designed range of expensive hardware and software. Ker-ching! They don’t even need to taint the service with ads. Apple would make money in the advertising gold-rush by “selling the spades” to advertisers — by which I mean, selling the means to comprehensively understand two very difficult markets: rich people who have discerning taste and a good education, and their smart tween kids. They would do this just as the affluent middle classes are set to expand by a few billion people across the world. They would do this just as the technology emerges that will almost totally wipe out ads from our experience, if we want that. Such a search strategy would let Apple retain its uber-cool niche by having an ad-free yet highly advanced ‘personal search’ assistant service, while freeing Apple from the daunting prospect of burning money to battle Google in the ‘research search’ AdWords market. The most lucrative part of the latter, product research by intending buyers, might even be predicted very early and taken care of by a Siri Purchase assistant (days before Google Now figured it out and pushed you to Google Search via some pre-formed keyword searches).
Ah, well… who knows? But it would be cool if a predictive search service might eventually be just a Siri-like voice quietly warbling into one augmented ear, with the AI backend constantly learning (from your natural replies and tone of voice) if the search result was useful/timely or not. For now, an iEar personal search assistant would at least help bypass the camera phobia that’s currently dogging Google Glass. Although it would not solve the problem that no-one in an office or on a commute wants to overhear their neighbour constantly talking to their assistant device.
Could be worn with any glasses, giving the glasses strut a peg on which to rest and also a pass-through hole into the earpiece.
Another reason to move from Chrome to Firefox, it seems. The latest beta of Chrome has removed the site URL…
In the most recent [beta] update [of Chrome] Google appears to have declared war on URLs. The Omnibox a.k.a “address bar” up top doesn’t display URLs in the latest Chrome Canary build, opting instead for an “origin chip”. … That’s not the only change. Since URLs are no longer displayed in the address bar the default text that will be displayed at all times is “Search Google or type URL.”
Google has released all its old Google Street View pictures, so we can travel back in time….
We’ve gathered historical imagery from past Street View collections dating back to 2007 to create this digital time capsule of the world. If you see a clock icon in the upper left-hand portion of a Street View image, click on it and move the slider through time and select a thumbnail to see that same place in previous years or seasons. Now with Street View, you can see a landmark’s growth from the ground up, like the Freedom Tower in New York City or the 2014 World Cup Stadium in Fortaleza, Brazil. This new feature can also serve as a digital timeline of recent history, like the reconstruction after the devastating 2011 earthquake and tsunami in Onagawa, Japan. You can even experience different seasons and see what it would be like to cruise Italian roadways in both summer and winter.
“A Google engineer has developed an algorithm that spots breaking news stories on the Web and illustrates them with pictures.”
I found a 2013 article from geoscientists who had tested Google Scholar: “Literature searches with Google Scholar: Knowing what you are and are not getting”. Although the body of the paper states that their test phrase was “wildfire-related debris flows”, the data shows they actually tested Scholar with the unquoted keywords wildfire-related debris flows, rather than the exact phrase. They reportedly found that…
“free articles were available in PDF format for 88% of citations returned by Google Scholar. They were available from open-access journals or via links to organizational sites where authors had posted their publications.”
However, if you actually look at their linked search-results data file, the above statement needs additional clarification, since it’s clear that paywalled articles from Elsevier, Springer and the like, appearing in their Scholar results, were being counted toward those “free articles”. It turns out that many of these were “free” only via a DigiTop proxy overlay for Scholar that is, in the words of DigiTop, “available to USDA employees only”. Nice if you work under the U.S. Department of Agriculture umbrella, but it seems that those outside have to pay.
Does Google Scholar perhaps need to add some kind of “paywall box detector” to its scraper bots? Then perhaps something like [PDF] [-||-] could be added on the right-hand column of the Scholar results, to indicate a PDF that’s “available maybe” — but which will prove to have a paywall that needs to be either backed out from or negotiated? And perhaps [PDF] [-~-] could indicate a genuine direct link to a bona fide PDF file?
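As a rough sketch of what such a detector might check: a bona fide PDF file starts with the five bytes %PDF-, whereas a paywall landing page is normally plain HTML. The function name, and the idea of sampling only the first few bytes of whatever the link returns, are assumptions made for illustration and not anything Google Scholar is known to do. The two markers are the ones suggested above.

```javascript
// Hypothetical helper, for illustration only: label a "[PDF]" link by
// looking at the first bytes that come back when the link is fetched.
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d]; // the ASCII bytes "%PDF-"

function labelPdfLink(firstBytes) {
  // True only when every signature byte matches at the very start.
  const isRealPdf = PDF_SIGNATURE.every((byte, i) => firstBytes[i] === byte);
  return isRealPdf ? "[PDF] [-~-]" : "[PDF] [-||-]";
}

// Example inputs: the start of a real PDF, and the start of an HTML page.
const encoder = new TextEncoder();
console.log(labelPdfLink(encoder.encode("%PDF-1.4\n")));
console.log(labelPdfLink(encoder.encode("<!DOCTYPE html>")));
```

The first call gives the "genuine PDF" marker and the second the "available maybe" marker. A real detector would also have to cope with redirects, login pages that only appear later, and PDFs that carry a few junk bytes before the signature.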
Anyway… this is what geoscientists are talking about when they refer to wildfire-related debris flows. Seems like it might be a geological process that intelligent farmers, hiker-campers, and treeline homesteaders around the world would like to learn some precise details about…
Giant mudslides, basically.
Incidentally, the same wildfire-related debris flows search in JURN needs to be tightened up just a little for strong results. Using wildfire-related “debris flows” works better, though the first six pages of good results do stray just a little (to pick up what seem to be three articles about prehistoric ‘dinosaur-era’ debris flow events). Yet even on this test JURN appears to be doing about twice as well as Google Scholar in terms of getting open articles, once Scholar’s ‘false-positive’ paywall PDFs from Elsevier & co. are subtracted from Scholar’s results.
Paperpile has been reviewed by PC World magazine (4th March 2014). Paperpile is a browser-based competitor to Mendeley. It integrates tightly with Google services such as Google Scholar and Google Drive, and can also slurp academic PDFs “directly from Google search results”. I’d be interested to hear if it works with JURN. Once the found PDFs are in your Google Drive cloud storage, it’s reported that…
“Paperpile analyzes your papers and acquires all the necessary metadata by itself.”
Sadly it’s only for the Chrome browser, not Firefox. At present it seems to be just a personal workflow aid, since there’s no collective exposure of the found content to a single public search box (as is offered by Mendeley’s “Search papers” search box).
Most papers will be downloaded at speed, because they “seem they might be worth looking at later”. Yet if Paperpile were able to measure re-open rates, view duration and frequency, and the actual level of citation in a person’s finished project or work, then that would be an interesting basis for a bumping algorithm that could help power the results ranking in a public searchable catalog. Especially if Paperpile could broadly match or align your research interests with those of similar Paperpile users, in combination with a more standard citation analysis, to give you a tailored search experience. Although in practice I guess there would be huge and possibly unwanted feedback amplification loops generated by that approach, as search results could veer heavily toward the latest fashionable topics. Doubtless Google has this nailed down already, and there’s probably a Trendy Search Topic Surge Controller employed somewhere in the Googleplex.
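To make that idea a bit more concrete, here is a minimal sketch of such a bumping score. Everything in it is an assumption made for illustration: the field names, the weights, and the use of a logarithm to damp heavy use (a crude guard against the feedback loops mentioned above). It is not anything Paperpile is known to do.

```javascript
// Hypothetical weights, chosen only for illustration.
const WEIGHTS = { reopen: 1.0, view: 0.5, cited: 3.0 };

// Score one saved paper from the usage signals described above.
function bumpScore(paper, weights = WEIGHTS) {
  // log1p damps very large counts, so one obsessive reader cannot dominate.
  const reopen = Math.log1p(paper.reopenCount);
  const view = Math.log1p(paper.minutesViewed);
  const cited = paper.citedInFinishedWork ? 1 : 0;
  return weights.reopen * reopen + weights.view * view + weights.cited * cited;
}

// Return a new array ordered from highest to lowest score.
function rankByBump(papers) {
  return [...papers].sort((a, b) => bumpScore(b) - bumpScore(a));
}

const ranked = rankByBump([
  { id: "grabbed-at-speed", reopenCount: 0, minutesViewed: 1, citedInFinishedWork: false },
  { id: "actually-used", reopenCount: 5, minutesViewed: 40, citedInFinishedWork: true },
]);
console.log(ranked.map((p) => p.id));
```

The paper that was re-opened, read and cited ranks above the one that was merely grabbed. Matching users by research interest, or mixing in standard citation analysis, would be further terms in the same sum.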
The Google Cultural Institute website is new to me. It seems Google has a Pinterest, sort of. It appears to work in much the same way as Pinterest, but the pictures are drawn from images in various hi-res/open museum digitisation collections.
No ‘kitties in art’ collection yet, although searching for “cat” will get you a big kittie fix if you’re desperate.
One of Google’s public data-driven prediction systems has caught a cold, according to weighty new research…
“Google Flu Trends, which launched in 2008, monitors web searches across the US to find terms associated with flu activity such as “cough” or “fever”. It uses those searches to predict up to nine weeks in advance the number of flu-related doctors’ visits that are likely to be made. The system has consistently overestimated flu-related visits over the past three years, and was especially inaccurate around the peak of flu season — when such data is most useful.”
The doctors prescribe taking a healthy dose of national health statistics…
“Merely projecting current CDC data [doctors’ visits as recorded at the US Centers for Disease Control and Prevention] three weeks into the future yields more accurate results than those compiled by Google Flu Trends. Combining the two resulted in the most accurate model of all.”
Although one has to wonder about prediction feedback loops here. What if Google Flu Trends was actually right, but Trends-watching doctors, carers and the public all put into effect various extra measures that stopped the Trends prediction from coming true in the longer-term six-to-nine week window? Or what about some kind of media amplification loop: more media chatter hits the news as the epidemic surfaces into the public mood, meaning that non-sufferers start using the relevant keywords more in social media?
It seems that Google Search have committed to their new code for displaying Google Search results, after trialling the changes last week and then withdrawing them. The changes break the vital browser addon GoogleMonkeyR. A temporary fix is to edit the GoogleMonkeyR userscript thus…
Find this line:
var list = document.getElementsByXPath(".//div[@id='ires']/ol/li[starts-with(@class,'g')]/div/parent::li");
And replace it with:
var list = document.getElementsByXPath(".//div[@id='ires']/ol/div[starts-with(@class,'srg')]/li");
Confirmed as working with Google.com search. Fails when you switch the keyword through to Google News.
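Since Google has already flipped between layouts once, one option is to try both expressions in turn rather than hard-coding one. A minimal sketch, assuming the lookup can be passed in as a function that returns an array-like list of nodes. The names findResultItems and RESULT_XPATHS are made up for illustration, and this is untested inside GoogleMonkeyR itself.

```javascript
// The two XPath expressions quoted above, tried in this order.
const RESULT_XPATHS = [
  ".//div[@id='ires']/ol/div[starts-with(@class,'srg')]/li",
  ".//div[@id='ires']/ol/li[starts-with(@class,'g')]/div/parent::li",
];

// Hypothetical helper: return the first non-empty set of result nodes.
// `evaluate` is whatever function runs one XPath and returns an array-like.
function findResultItems(evaluate, xpaths = RESULT_XPATHS) {
  for (const xpath of xpaths) {
    const items = evaluate(xpath);
    if (items && items.length > 0) {
      return items;
    }
  }
  return []; // neither layout matched
}
```

Inside the userscript this might be called as findResultItems((xp) => document.getElementsByXPath(xp)), assuming that helper returns something with a length.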
UPDATE, NOV 2014.
Still working fine for me, with a few tweaks…
2. I access Google Search via this URL, which has a parameter that limits search results to 15 per page…
15 fits nicely in three columns, which I also have set up in GoogleMonkeyR Prefs — which is the cog-wheel that appears top-right once you make a Google search.
3. Hide the “Searches related to test” element on the Google Search results page, by using the Adblock Plus addon (right-click on “Searches related to test”, ‘Inspect Element’, highlight whole ‘extrares’ element, click on red Adblock Plus icon, block). This bit gets hidden because otherwise it sits awkwardly between you and the numbered links that lead to the subsequent results pages.
Wouter has hacked out a Google Scholar API workflow today, sort of. I suspect the reason Scholar has never offered an API is the agreements Google has with the large commercial journal publishers and citation database providers.
I note that Google Scholar’s single author citations pages are now to be found in the main Google Search results…
Although it seems that if a prolific or influential author has two or more pages of citations, only the first will show up in Google Search. For example…
site:scholar.google.com/citations “graham harman”
In an unusual move Google appears to have created its own Custom Search Engine, Custom Search for K-12 Computer Science Education. For the benefit of those outside the USA, “K-12” isn’t the name of some obscure Linux module. It seems to be U.S. educational jargon for schooling from kindergarten through 12th grade, so roughly for kids aged 5 to 18.
A new study, “Google Scholar and DSpace”…
“The average indexing ratio [in Google Scholar] for our sample of 10 recent DSpace repositories is 64.8%”
I wonder if the interface presentation has an influence? http://circle.ubc.ca/ is totally hardcore in presentation and keywording, and is indexed at 99%. Whereas http://dash.harvard.edu/ has a more student-friendly blog-like look and feel to it, and is indexed at just 26% despite the harvard.edu domain. But perhaps not, as I guess it’s more likely due to the presence or otherwise of good machine-readable metadata.
Just when blogs were making a comeback, after the inane collective Twitter-gasm of the last few years… today Google has removed “Blogs” from the switch-through options at the top of the Google Search results …
They’ve also recently made some pointed comments to bloggers about allowing spammy “guest bloggers” to use their blogs.
Google has announced that Google Chrome browser users will not be allowed to install their own choice of plugins, addons, and userscripts, from January 2014. Today I moved over to using Firefox, as a result. Here are my notes on the “how to” of the move from Chrome to Firefox, in the hope the notes may help a few others:
1. Back up any old bookmarks from any existing install of Firefox. It seemed best to start fresh, so I removed the old version of Firefox via a full uninstall.
2. Download and install the very latest Firefox. As this was a fresh install, the first time Firefox loads it should offer to automatically port over all your bookmarks, toolbar bookmarks, passwords, etc. from Chrome. (The tiny favicons will only reappear, next to bookmarks on your toolbar, when you revisit those bookmarked pages).
3. Tweak the Firefox interface. I prefer to get back to a retro look with Classic Reload-Stop-Go Buttons.
Then go View | Toolbars | Customize. While this Customize library window is open, you are able to drag around the navigation icons in the navigation bar. Get the icons positioned how you want them, then before you close the Customize library window choose “Icons + Text”. Then click “done”. This is how I like the top left on my browser…
5. Add some basic advert and click-jacking blocker add-ons:
NoScript (annoying initially)
And then in Firefox go to: Tools | Addons | Plugins and disable all the craptastic media-player plugins that ship with Firefox (RealPlayer and the like, ugh). I only left Flash on “Always Activate” — since the Flashblock add-on (above) keeps it under control.
6. Then block the web’s other annoyances with these add-ons:
Facebook Purity (and import any blocklist / settings from your Chrome version of F.B. Purity)
7. Add userscript capability to Firefox:
Greasemonkey (required for running all userscripts). Followed by…
GoogleMonkeyR. Vital for working with Google Search, in my opinion. I set it up to display results in three columns, and also to block several bits of Google Search cruft.
(To find GoogleMonkeyR settings: make any search in Google, then right-click on the grey cog. Bear in mind that ticking “Don’t display the Google Web Search dialogues” may prevent the search box appearing above the top of search results in Google Images, and Google Books).
Direct Links in Google Search. This forces direct URLs to be used in the search result links.
Google Hit Hider by Domain (blocks Google Search results by unwanted domain). Import your old Google Search blocklist from “Personal Blocklist (by Google)”, then use the de-duplicate tool in Google Hit Hider…
8. Finally, go to Tools | Options | General | Home Page. There paste in this handy home page URL, which will send you to the main Google Search when you click on the Home button in Firefox:
This special URL has certain parameters embedded in it, which:
* forces Google Search to use Verbatim (it searches on just what you type, not what it guesses you might want)
* sets the number of results to 18 (perfect with a widescreen monitor and GoogleMonkeyR using three columns)
* forces the top Search Tools open, displaying drop-down items
* forces Google Search to use its complete main USA index, without making an automatic switch to a local version
* and turns Google Search’s Autocomplete off.
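For anyone wanting to build a similar URL, here is a minimal sketch using the standard URL API. The parameter names and values are assumptions, based on parameters Google Search has been reported to accept (tbs=li:1 for Verbatim, num for results per page, gl for country, complete=0 for Autocomplete off). They are not necessarily those in the home page URL described above, and Google may change or drop them at any time. The /webhp path is likewise a guess, and no parameter is assumed for opening Search Tools.

```javascript
// Assumed parameter names and values; none are taken from the original URL.
function buildGoogleHomeUrl(resultsPerPage = 18) {
  const url = new URL("https://www.google.com/webhp"); // assumed path
  url.searchParams.set("tbs", "li:1");                 // Verbatim (assumed)
  url.searchParams.set("num", String(resultsPerPage)); // results per page (assumed)
  url.searchParams.set("gl", "us");                    // USA index (assumed)
  url.searchParams.set("complete", "0");               // Autocomplete off (assumed)
  return url.toString();
}

console.log(buildGoogleHomeUrl());
```

Building the URL this way keeps each setting on its own line, so it is easy to drop one if Google stops honouring it. Note that the colon in li:1 comes out percent-encoded in the printed URL, which is normal.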
The resulting ad-free nag-free search results layout, with GoogleMonkeyR and the above fixes:
9. You can use the same URL trick with a Google News search, dragged onto your bookmarks bar, thus:
Replace the keyword in the above URL with your own. Switch out “uk” for “us”, etc.
Also handy is this Google Books link, with parameters included:
10. Other Firefox add-ons that are also very useful:
* the free grammar and spelling checker After the Deadline + Menu Editor to reverse AfterTD’s impudent hijacking of the top of the right-click context menu in Firefox. Sadly there’s no way to have AfterTD use British English spelling.
* Bookmark Favicon Changer 2.0 (is the only one that works with the latest Firefox)
* Instasaver (Instapaper saver button for Firefox) (works with the latest Firefox including Nightly developer version, requires an Instapaper account)
* NoSquint (a nice flexible and easily resettable zoom tool)
A useful new Google Scholar feature: Library. Save a personal selection from your search results, then share that collection with others. Now to write a bot that auto-bookmarks just the open access articles :)
It looks like I’ll be switching back to Firefox as a Web browser, over Christmas, as Google Chrome is set to block the installation of all extensions that don’t come from its own extension store. There is no way I could tolerate Google Search without GoogleMonkeyR, or Facebook without F.B. Purity. After the Deadline is also not on the Chrome extensions store.
Google has rolled out a major upgrade to Search…
“The new algorithm, codenamed Hummingbird, … the first major upgrade for three years … is especially useful for longer and more complex queries. … more capable of understanding concepts and the relationships between them rather than simply words”