In an unusual move, Google appears to have created its own Custom Search Engine, Custom Search for K-12 Computer Science Education. For the benefit of those outside the USA, “K-12” isn’t the name of some obscure Linux module. It seems to be U.S. educational jargon indicating: “state schooling for kids aged 5 to 18”.
A new study, “Google Scholar and DSpace”…
“The average indexing ratio [in Google Scholar] for our sample of 10 recent DSpace repositories is 64.8%”
I wonder if the interface presentation has an influence? http://circle.ubc.ca/ is totally hardcore in presentation and keywording, and is indexed at 99%. Whereas http://dash.harvard.edu/ has a more student-friendly blog-like look and feel to it, and is indexed at just 26% despite the harvard.edu domain. But perhaps not, as I guess it’s more likely due to the presence or otherwise of good machine-readable metadata.
Just when blogs were making a comeback, after the inane collective Twitter-gasm of the last few years… today Google has removed “Blogs” from the switch-through options at the top of the Google Search results …
They’ve also recently made some pointed comments to bloggers about allowing spammy “guest bloggers” to use their blogs.
Google has announced that Google Chrome browser users will not be allowed to install their own choice of plugins, addons, and userscripts, from January 2014. Today I moved over to using Firefox, as a result. Here are my notes on the “how to” of the move from Chrome to Firefox, in the hope the notes may help a few others:
1. Backup any old bookmarks from any existing install of Firefox. It seemed best to start fresh, so I removed the old version of Firefox via a full uninstall.
2. Download and install the very latest Firefox. As this was a fresh install, the first time Firefox loads it should offer to automatically port over all your bookmarks, toolbar bookmarks, passwords, etc. from Chrome. (The tiny favicons will only reappear, next to bookmarks on your toolbar, when you revisit those bookmarked pages).
3. Tweak the Firefox interface. I prefer to get back to a retro look with Classic Reload-Stop-Go Buttons.
Then go View | Toolbars | Customize. While this Customize library window is open, you are able to drag around the navigation icons in the navigation bar. Get the icons positioned how you want them, then before you close the Customize library window choose “Icons + Text”. Then click “done”. This is how I like the top left on my browser…
5. Add some basic advert and click-jacking blocker add-ons:
NoScript (annoying initially)
And then in Firefox go to: Tools | Addons | Plugins and disable all the craptastic media-player plugins that ship with Firefox (RealPlayer and the like, ugh). I only left Flash on “Always Activate” — since the Flashblock add-on (above) keeps it under control.
6. Then block the web’s other annoyances with these add-ons:
Facebook Purity (and import any blocklist / settings from your Chrome version of F.B. Purity)
7. Add userscript capability to Firefox:
Greasemonkey (required for running all userscripts). Followed by…
GoogleMonkeyR. Vital for working with Google Search, in my opinion. I set it up to display results in three columns, and also to block several bits of Google Search cruft.
(To find GoogleMonkeyR settings: make any search in Google, then right-click on the grey cog. Bear in mind that ticking “Don’t display the Google Web Search dialogues” may prevent the search box appearing above the top of search results in Google Images, and Google Books).
Direct Links in Google Search. This forces direct URLs to be used in the search result links.
Google Hit Hider by Domain (blocks Google Search results by unwanted domain). Import your old Google Search blocklist from “Personal Blocklist (by Google)”, then use the de-duplicate tool in Google Hit Hider…
8. Finally, go to Tools | Options | General | Home Page. There paste in this handy home page URL, which will send you to the main Google Search when you click on the Home button in Firefox:
This special URL has certain parameters embedded in it, which:
* forces Google Search to use Verbatim (it searches on just what you type, not what it guesses you might want)
* sets the number of results to 18 (perfect with a widescreen monitor and GoogleMonkeyR using three columns)
* forces the top Search Tools open, displaying drop-down items
* forces Google Search to use its complete main USA index, without making an automatic switch to a local version
* and turns Google Search’s Autocomplete off.
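For anyone who would rather build such a URL than paste one, here is a minimal Python sketch of the idea. Every parameter name and value in it is my assumption, based on commonly reported and undocumented Google Search URL options. None of it is copied from the home page URL above, and Google may change or drop any of these at any time. The base address and the `build_search_url` helper are likewise hypothetical.

```python
from urllib.parse import urlencode

# ASSUMPTION: these parameter names/values are commonly reported,
# undocumented Google Search options. They are illustrative only.
ASSUMED_PARAMS = {
    "tbs": "li:1",    # assumed: Verbatim mode
    "num": "18",      # assumed: number of results per page
    "tbo": "1",       # assumed: keep the Search Tools bar open
    "gl": "us",       # assumed: prefer the main USA index
    "complete": "0",  # assumed: Autocomplete off
}


def build_search_url(query=None, base="https://www.google.com/search",
                     params=ASSUMED_PARAMS):
    """Join a base address and a dict of parameters into one URL."""
    fields = dict(params)          # copy, so the defaults are not mutated
    if query is not None:
        fields["q"] = query        # optional search keyword
    # safe=":" keeps the colon in "li:1" readable instead of "%3A"
    return base + "?" + urlencode(fields, safe=":")


print(build_search_url())
```

Swap values in the dictionary to taste, then paste the printed URL into Tools | Options | General | Home Page and check that each setting actually takes effect.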
The resulting ad-free nag-free search results layout, with GoogleMonkeyR and the above fixes:
9. You can use the same URL trick with a Google News search, dragged onto your bookmarks bar, thus:
Replace the keyword in the above URL with your own. Switch out “uk” for “us”, etc.
Also handy is this Google Books link, with parameters included:
10. Other Firefox add-ons that are also very useful:
* the free grammar and spelling checker After the Deadline + Menu Editor to reverse AfterTD’s impudent hijacking of the top of the right-click context menu in Firefox. Sadly there’s no way to have AfterTD use British English spelling.
* Bookmark Favicon Changer 2.0 (is the only one that works with the latest Firefox)
* Instasaver (Instapaper saver button for Firefox) (works with the latest Firefox including Nightly developer version, requires an Instapaper account)
* NoSquint (a nice flexible and easily resettable zoom tool)
A useful new Google Scholar feature: Library. Save a personal selection from your search results, then share that collection with others. Now to write a bot that auto-bookmarks just the open access articles :)
It looks like I’ll be switching back to Firefox as a Web browser, over Christmas, as Google Chrome is set to block install of all extensions that don’t come from its own extension store. There is no way I could tolerate Google Search without GoogleMonkeyR, or Facebook without F.B. Purity. After The Deadline is also not on the Chrome extensions store.
Google has rolled out a major upgrade to Search…
“The new algorithm, codenamed Hummingbird, … the first major upgrade for three years … is especially useful for longer and more complex queries. … more capable of understanding concepts and the relationships between them rather than simply words”
Google has obviously demoted Google Scholar over the last year or so, as well as loosening the content-inclusion parameters. Max Kemman now asks: will Google close down Google Scholar? The article notes that…
“cited by” and “related articles” functionalities in Google Scholar [...] are already available in [the main Google] Search
If he’s correct, there may be another reason for it. Have people in Google taken a good look at the slow-but-sure progress of Microsoft Academic Search, and found they don’t like what they see? Is Google wary of waking up one day to find that the Microsoft tortoise has once again executed its traditional killer slow-mo back-flip karate on a competitor hare?
How to do reverse image search in Google Images Search:
1. Find and copy the original direct URL of the image which needs identifying.
2. Go to Google Image Search and click on the camera icon in the search box…
3. A search dialogue box will open. Paste the image’s URL into the box, and search…
4. View results…
You can also upload an image, as well as just paste a URL.
A just-released Greasemonkey script Google Scholar Citation Exporter…
“Extension of Mayank Lahiri’s ‘Google Scholar Citation Exporter’ that prints results to CSV, for further use in other applications.”
The new Google Online Course Builder…
“our experimental first step in the world of online education”
“Google Scholar: The Good, The Bad and The Ugly”, a short free Powerpoint from the University of Leeds in the UK. It’s a useful up-to-date summary, but I’d worry about the document’s opening claim that Google Scholar has… “Almost 100% coverage of journals from partner databases and publicly available TOCs”. This may mislead people into assuming that Google Scholar has complete coverage. It doesn’t. As I’ve said before, it is rather poor at including the contents of large numbers of open access arts and humanities ejournals.
Ocropus is Google’s OCR software, and it’s open source.
Google has added images to the Google JSON/Atom Custom Search API, enabling the construction of specialist image-only CSEs. Users of the API can have 100 free queries a day — and can purchase more at $5 per 1000 queries, for up to 10,000 queries per day.
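To get a feel for what those numbers mean in practice, here is a rough cost sketch in Python. It uses only the figures quoted above. Two things are my assumptions rather than anything Google states here: that paid queries are billed pro rata (not in whole blocks of 1000), and that the 100 free queries count toward the 10,000 daily ceiling. The function name is hypothetical.

```python
FREE_QUERIES_PER_DAY = 100      # from the post
PRICE_PER_1000_USD = 5.0        # from the post
MAX_QUERIES_PER_DAY = 10_000    # from the post


def daily_cost_usd(queries):
    """Estimate one day's bill for a given number of API queries.

    ASSUMPTION: billing is pro rata per query above the free quota.
    """
    if queries < 0:
        raise ValueError("queries cannot be negative")
    if queries > MAX_QUERIES_PER_DAY:
        raise ValueError("over the daily ceiling quoted in the post")
    billable = max(0, queries - FREE_QUERIES_PER_DAY)
    return billable * PRICE_PER_1000_USD / 1000


for n in (100, 1100, 10_000):
    print(n, "queries per day:", daily_cost_usd(n), "USD")
```

So, under those assumptions, a specialist image CSE that maxes out the daily ceiling would cost a little under fifty dollars a day.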
The Google Desktop Search software became officially defunct toward the end of 2011. But one can still download the last 5.9.1 version, and it happily installs and indexes and searches the full-text of your content. For instance, it can handle a folder full of GBs of PDF encyclopaedias, journal articles, ebooks, etc., presenting results in a familiar Google Search interface. Note the indexing has to be manually started by you, and this is done by right-clicking the taskbar icon and selecting “reindex”…
But if you need a personal desktop search product that’s being supported and developed, perhaps due to the need to index a new file-format, then the alternatives are…
* the free ad-supported Copernic Desktop Search. Well-reviewed and mature software. Can be a bit aggressive in its initial indexing, but then it works quickly and intuitively. There is also a Copernic Desktop Search Professional Edition. The best everyday replacement for Google Desktop Search.
* dtSearch Desktop (£119, PC World review from 2011). Very mature and powerful software, although the price of $199 will likely make it unappealing to personal users. The dauntingly powerful interface will also likely make it unappealing to small business users.
* the new X1 Desktop Search. The X1 website’s main landing page seems to be positioning the X1 range for the corporate market.
* DocFetcher 1.1 is Java-based desktop search software that’s open source and free. It’s been around since 2009, but doesn’t seem to have any genuine reviews (that I could find). Supports indexing of Open Office file types.
* the free built-in Windows 7 search. Although now tamed, and no longer the fearsome disk-grinding Windows Vista incarnation, in my view turning on Windows Search still makes a desktop PC too slow. Especially if you run a PC stuffed to the top with legacy files and emails.
* Effective File Search (freeware)
I’m currently reading journalist/historian Steven Levy’s In The Plex: How Google Thinks, Works, and Shapes Our Lives (Simon & Schuster, April 2011). At the half-way point through the book (Google is at the stage of throwing billion-dollar data centers around the planet), I can say it’s wonderfully precise on the ancient history of the company. I’ve taught lessons on the history of Google to undergraduates numerous times, so a lot of the events and personalities are familiar, but it’s great to now have a book that’s so authoritative. I’d previously read and enjoyed Levy’s Crypto: How the [Cryptography] Code Rebels Beat the Government, Saving Privacy in the Digital Age (2002), and his new book is just as neatly structured, well researched, and elegantly written. Highly recommended.
Google Web Fonts, a new Google service. It offers a snippet of code that styles your website with a font. The font streams in over the Web, so your website’s text looks the same to all visitors. Although, judging by my experience of using a similar system with WordPress.com, it will slow down page loading. An especially nice choice for historians to experiment with might be Old Standard TT font…
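As a rough illustration of what that snippet of code looks like, here is a small Python sketch that generates the stylesheet tag for a given font family. The stylesheet address is the Google Fonts CSS endpoint as I understand it, so treat it as an assumption and check it against what the service itself gives you. The `font_link_tag` helper is hypothetical.

```python
from html import escape
from urllib.parse import urlencode


def font_link_tag(family, base="https://fonts.googleapis.com/css"):
    """Return a <link> tag that loads one web font family.

    ASSUMPTION: `base` is the Google Fonts stylesheet endpoint.
    """
    # urlencode turns spaces into "+", e.g. "Old+Standard+TT"
    href = base + "?" + urlencode({"family": family})
    return '<link rel="stylesheet" href="{}">'.format(escape(href, quote=True))


print(font_link_tag("Old Standard TT"))
# Then, in your own CSS, something like:
#   body { font-family: 'Old Standard TT', serif; }
```

The tag goes in the head of your page, and the serif fallback in the CSS covers visitors whose browsers fail to fetch the font.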
[ Hat-tip: Beautiful Web Type ]
Hurrah! I found a viable way to automatically, reliably, and fairly simply grab a CSV of Google Search results. With URL, title (anchor) text, and even the sample snippet. This is, of course, only intended for academic use — to speedily build useful lists of subject-specific links.
1. Download the free MozBar addon for Firefox. It’s SEO stuff for webmasters, but it’s free and it works. Note that the CSV export feature is only present in the Firefox toolbar. Not the Google Chrome version.
2. Temporarily turn off any Firefox addons you might have for modifying the appearance of Google Search results, such as GoogleMonkeyR.
3. Go to Google Search, go to Search Settings, and turn on Google Instant if you have it disabled. Turn the number of results to 100. Save. Now do a test search.
No SERP Control Panel showing up? Click on the new SEOMoz toolbar (it’s sitting up near the top of your browser), click on the grey cogs, and select Google…
The SERP Control Panel overlay should now appear over to the right of the search results. Note that you may also need to repeat this step, for each new search or page, in order to get the data cued up correctly for a fresh CSV output, if you have Google Instant turned off.
4. On the SERP control panel, click on “Export to CSV”…
Note that we can also do this with Bing and Yahoo, and perhaps others if you can make profiles for them. Possibly it might work with Google Scholar?
5. Open the resulting CSV file with Excel…
You even get the description/snippet from the search results, although prefaced with some junk. Simply delete everything in front of keyword “Undo” in the relevant column, by using Sobolsoft’s Excel Remove (Delete, Replace) Text, Spaces & Characters From Cells Addin for Excel…
Also delete the columns with the SEO junk in them. You now have three clean columns: URL, title, and snippet. Use a formula to convert these to pretty linked HTML in a fourth column, and/or paste them into a mega-file of subject-specific results for further weeding and sorting.
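If you would rather skip Excel, the clean-up and the linked-HTML step can be sketched in a few lines of Python. This is only a sketch: the column names (`URL`, `Title`, `Description`) are my guesses, not the real headers in the exported CSV, so adjust them to match your file. I have also assumed that the “Undo” keyword is itself junk and should go along with everything in front of it.

```python
import csv
import html
import io


def clean_snippet(text, marker="Undo"):
    """Drop everything up to and including the marker keyword."""
    head, found, tail = text.partition(marker)
    return tail.strip() if found else text.strip()


def rows_to_html(csv_text, url_col="URL", title_col="Title",
                 snippet_col="Description"):
    """Turn CSV rows into linked HTML lines.

    ASSUMPTION: the three column names are hypothetical placeholders.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = html.escape(row[url_col], quote=True)
        title = html.escape(row[title_col])
        snippet = html.escape(clean_snippet(row[snippet_col]))
        lines.append('<a href="{}">{}</a> - {}'.format(url, title, snippet))
    return lines


# Made-up sample data, standing in for the exported file
sample = (
    "URL,Title,Description\n"
    "http://example.org/a,Example A,junk text Undo A short snippet.\n"
)
for line in rows_to_html(sample):
    print(line)
```

For a real export, read the file with `open(path, newline="", encoding="utf-8").read()` and pass the text in. The escaping matters because titles and snippets often contain ampersands and angle brackets.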
None of the above is as robust or simple as the broken Google Extract Data and Text, and it’s to be hoped that Sobolsoft fixes this software soon for Windows 7 + IE9.
Sad to say, but the three Google Search harvesting utilities from Sobolsoft no longer work on Windows 7 with Internet Explorer 9. The utilities are: Google Save Search Results; Google Extract Data & Text; and Excel Import Multiple Google Search Results.
I’m guessing that to use these today one would need to blow the dust off an old Windows Vista PC with something like IE6 or IE7 installed, although the problem might be due to newer versions of Visual Basic runtimes or similar. The utilities don’t run well on Windows XP (I tried one, on an old laptop) because the GUI layouts are truncated in it, vital ‘save’ buttons are unreachable, and the software can’t be re-sized.
Among possible fallback options, none of the Google URL Harvester scripts for Greasemonkey work now. The clunky Outwit Hub still can’t seem to get past Google Search’s URL obfuscation and other clutter, making it fairly useless for the task. SEO software like URLHarvester and Scrapebox doesn’t seem to care about link titles or extracts, just raw URLs and PageRank.
Still working is the basic per-page method of using Multilinks (Firefox) or Linkclump (Chrome), and then my combinatory Excel spreadsheet.
** Update: found a new, free way to do it, that also harvests snippets.