29 Tuesday Mar 2011
Posted in JURN's Google watch
Did you know you can create and share a custom-search ‘channel’ on Google News? Google News Directory.

22 Tuesday Mar 2011
Posted in JURN tips and tricks, JURN's Google watch
Firefox 4 final is out now. Sadly it breaks the Greasemonkey script Google Noise Reduction, which was an excellent per-domain results blocker for Google Search.
However, the new and powerful Google Hit Hider does work very well, and is very similar. It has obviously learned a lot from earlier software like Blocksite, Surfclarity, and Noise Reduction (all of which no longer work with FF4 / the latest Google), and there are some nice refinements, not the least of which is very easy import/export as simple plain-text lists of URLs.
It’s a fairly simple process to get your hand-crafted Noise Reduction blocklist out of Firefox and into Google Hit Hider…
1. In Firefox’s address bar, type: about:config
2. Scroll down to greasemonkey.scriptvals.http://exego.net//Google Noise Reduction.blacklist You’ll see…
({‘britannia.com’:true, ‘oxfordjournals.org’:true, ‘tandf.co.uk’:true, ‘ingentaconnect.com’:true, ‘sagepub.com’:true, ‘myspace.com’:true, ‘experts-exchange.com’:true})
3. Double click on the line of banned URLs you’ll find there, and copy them to Notepad.
4. Now just top-and-tail the list (trim the opening and closing brackets), then search and replace to strip out the quote marks and each :true, until you have a clean list, but leave each URL separated by a single comma. Save the list as a .csv (comma separated value) file, then open that with MS Office’s Excel (or with Calc, the free OpenOffice equivalent). The list should load up with one URL per cell.
5. Now just copy and paste the resulting cleaned list into: Manage Hiding / List Util / ‘Perma-ban list’ in Google Hit Hider.
The advantage of this over the now-native Google blocking is that: i) it lets you break the 500 URL limit; ii) you can block domains en-masse rather than one at a time; and iii) it lets you easily import/export the blocklist, in order to share with colleagues etc.
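For those who would rather skip the Notepad and spreadsheet steps, the clean-up can be sketched as a few lines of Python. This is only a minimal sketch: the function name is my own, and it assumes the blacklist value looks like the example shown in step 2 (quoted domains, each followed by :true) and that a plain one-domain-per-line list is what you want to paste onward.

```python
import re

def blacklist_to_domains(raw):
    """Pull the domain names out of a Noise Reduction blacklist string.

    Assumes each entry is a quoted domain followed by :true, as in the
    example value shown in about:config. Straight and curly quotes are
    both accepted, since copying from a web page may change them.
    """
    pattern = r"[‘’'\"]([^‘’'\"]+)[‘’'\"]\s*:\s*true"
    return re.findall(pattern, raw)

# Example value, shortened from the one shown in the steps above.
raw = "({'britannia.com':true, 'oxfordjournals.org':true, 'myspace.com':true})"

domains = blacklist_to_domains(raw)
print("\n".join(domains))  # one domain per line, ready to copy
```

Paste your own copied value in place of the example string. If the target list wants commas rather than line breaks, join with ", " instead.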
06 Sunday Mar 2011
Google’s official ‘Personal Blocklist’ add-on for the Google Chrome web browser has now reached version 1.4 (released 28th February 2011). The new version brings a huge leap forward in search-results blocking: import/export of blocklists…
New features in version 1.4:
— import a list of patterns [URLs]
— plain-text pattern list for export
This is really excellent. For those needing a ready-made list to import, the free Waster List has over 2,000 URLs (domains to block, or ‘patterns’ as Google calls them). To install them, just open up Personal Blocklist in Chrome and copy-and-paste thus…
1. Find the little Google Blocklist icon in the top-right corner. This gives you access to the control panel. Click on Import. Anything you import is added to your existing block list.

2. Simply paste the entire Waster List into the panel, and click Import.

You’re done!
01 Tuesday Feb 2011
Posted in JURN's Google watch, Spotted in the news
A new service from Google, Google Art Project…
“Explore museums from around the world, discover and view hundreds of artworks at incredible zoom levels, and even create and share your own collection of masterpieces.”
Based on the Google Maps technology and its familiar interface, the images are gigapixel and presented without watermarks. There are just 17 gigapixel images to start with, and there are also StreetView-like tours of the museums. If images look a little blurry as you zoom in, simply give the tiles time to load (in a similar way to Google Earth), and the sharper tiles should appear.

The “Add” button for the creation of personal collections doesn’t seem to work in Firefox.
20 Thursday Jan 2011
Posted in JURN's Google watch
Interesting new Google search modifier…
inblogtitle:keyword
12 Wednesday Jan 2011
Posted in JURN tips and tricks, JURN's Google watch
A Google Search Results blacklist add-on for the Chrome browser. It’s called Search Engine Blacklist. I tried it, but unfortunately it requires you to manually copy-and-paste every URL you want to block. Version 2.6 does apparently add… “ability to blacklist from the search results page” — but I couldn’t find any sign of that either on the results page or in the Options. Firefox’s excellent and robust Greasemonkey blacklist script Noise Reduction for Google & Bing has a neat little button to click, placed next to each search result.
06 Thursday Jan 2011
Posted in JURN's Google watch, My general observations
A big dollop of lazy journo-bluster has landed at The Guardian, over the amount of outright spam that’s been inveigling itself into the Google search-results.
This growing so-called backlash is largely down to some users thinking they can still type in dishwasher review and get good results. Those “two keywords is enough” days are over: just spend 50 minutes learning how to search properly, guys. Yet some people are going to find learning this more difficult than others, as more and more people who are not fully literate are now trying to use the web. They can’t skim-read the results very well, or remember how to do complex strings of search modifiers. The ‘advanced search’ forms scare them. All the more reason why we need to be teaching search literacy from infant school onward.
Perhaps the Googleplexers who do nothing else but weed for spam are being temporarily overwhelmed? There’s an obvious tidal wave of robot-registered domains being populated by robots with robot-made pages. 99% of this Web spam has never seen a human hand, other than in the plagiarised material that gets pirated, semi-garbled, and pasted into the page. So, hire as many people as it takes to rip out the spam. It’s not as though Google doesn’t have the cash to throw another 500 eyeballs at the problem.
The other problem that people seem to be raising in the Guardian comments is that we don’t really have a reliable hand-made search-engine for product reviews, one that is devoted to serving only reliable reviews from reliable sources — and nothing else. Certainly, I’ve never found one I like and feel I can trust, and which is comprehensive in its sources and relevant to the UK.
04 Tuesday Jan 2011
IndexTank, custom search in a box. Nice idea. But it seems to be aimed at individual businesses looking to reduce their IT overheads, and is useless as a replacement for a Web-wide Google CSE…
“IndexTank doesn’t actively fetch data from you as a web crawler would do. Instead, your application sends IndexTank the data as soon as it is created or updated”
“not a standalone web search engine, and we don’t currently have a way for you to set it up directly through the Web. It requires downloading software such as a WordPress plugin (if you wanted to add better search to your blog, for example) or writing a program to interact with our servers.”
Worse, it can’t even auto-extract indexable text from the PDFs you send it…
“IndexTank, like other full-text search alternatives, indexes only text. However, for common formats like PDF or Word, it is very easy to parse them to obtain the readable text by using open source tools.”
I should mention some of the other ‘sort-of’ search-in-a-box options.
* The old and vulnerable (in the light of the Delicious closure) Yahoo BOSS
* Spinn3r. But it can only supply “A-list” blog content (so possibly not much use for hyperlocal indexing of a city-region), and you have to build your own widget to hook into its API.
* 80 Legs is a pricey monthly-subscription web-crawler. I’m uncertain if their stated ‘URL limit’ refers to the number of URLs on the originating site-list, or the number of files actually found by their crawler. If it’s the latter, you could run out of space very fast.
* And of course the new Blekko, which lets you upload a text file full of your selected URLs, and then uses them to create a ‘slashtag’ that delimits people’s searches. The last one is interesting, and I might eventually have a play around with it. Although possibly that’ll be when you’re no longer limited to 1,000 URLs, and are allowed to use wildcards in the URL list.
It’s great to see some competition emerging to Google CSEs, and perhaps it will eventually spur Google into offering a commercial ‘Deep’ Web-wide version of the Custom Search Engine:— full-text deep indexing of all the documents found at any website it’s pointed at; all the documents found are drawn on to produce your custom search results, every time; and the user gets 12,000 URLs to play with. Or perhaps Microsoft Bing will offer such a service. It might be limited to non-profits, so as to keep the SEO spivs out.
22 Wednesday Dec 2010
Posted in JURN's Google watch, Ooops!
Spamming Google Scholar. Very possible, or so it seems…
“…we conducted several tests on Google Scholar. The results show that academic search engine spam is indeed – and with little effort – possible: We increased rankings of academic articles on Google Scholar by manipulating their citation counts; Google Scholar indexed invisible text we added to some articles, making papers appear for keyword searches the articles were not relevant for; Google Scholar indexed some nonsensical articles we randomly created with the paper generator SciGen; and Google Scholar linked to manipulated versions of research papers that contained a Viagra advertisement.”
Beel, J. (2010)
Academic Search Engine Spam and Google Scholar’s Resilience Against it.
Journal of Electronic Publishing 13 (3), December 2010.
15 Wednesday Dec 2010
Posted in JURN's Google watch
A new Google search modifier… AROUND.
apples AROUND(3) pears
…gives results that contain the word “apples” within three words of “pears”.
[ Hat-tip: Researchbuzz ]
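To make the idea concrete, here is a rough sketch in Python of what a proximity test like this does. It is not Google’s implementation, which isn’t public. The function name is my own, and it assumes that “within three words” means the two words sit no more than three word-positions apart, in either order.

```python
import re

def around(text, first, second, n):
    """Return True if `first` and `second` occur within n word-positions
    of each other in `text`, in either order (an assumed reading of
    what AROUND(n) means; Google's exact rules are not documented here).
    """
    words = re.findall(r"\w+", text.lower())
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= n for i in pos_first for j in pos_second)

print(around("Apples and ripe pears", "apples", "pears", 3))
print(around("Apples are not at all like pears", "apples", "pears", 3))
```

The first sentence passes the test, since “pears” is three positions after “apples”. The second fails, since the two words are six positions apart.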
10 Friday Dec 2010
Posted in JURN's Google watch
Google has implemented a new filter that allows the filtering of search results by ‘reading level’. It’s accessed via the Advanced Search page, thus…

In a search for the term “reading level”, with the Reading Level set to Advanced, I still had a basic About.com page in the first page of results, as well as this blatant SEO spam page as result No.8.
A search for ‘tolkien + symbols’ showed better results, with a solid and useful first two pages of results. Although not that much different from the standard search, except that using Advanced Reading Level blocked a result from the scumbag SEO spam domain directhit.com on the second page of plain results.
20 Saturday Nov 2010
You may have spent some time building up a list of banned URLs for the Firefox addon Surfclarity, which strips unwanted domains from Google Search Results. Surfclarity no longer works with the latest Google changes, but the Greasemonkey script Google Noise Reduction does. In this tutorial we’ll swop the Surfclarity blacklist into the Google Noise Reduction blacklist.
1. In Firefox’s address bar, type: about:config.
2. Scroll down to extensions.surfclarity.patterns
Double click on the line of banned URLs you’ll find there, and copy them to Notepad.
3. Scroll further down to greasemonkey.scriptvals.http://exego.net//Google Noise Reduction.blacklist and take a look at the format. Note that it’s a little different than Surfclarity…
({‘britannia.com’:true, ‘oxfordjournals.org’:true, ‘tandf.co.uk’:true, ‘ingentaconnect.com’:true, ‘sagepub.com’:true, ‘myspace.com’:true, ‘experts-exchange.com’:true})
So we’re going to have to do some basic search-and-replace on our Surfclarity blacklist. Back up the Google Noise Reduction.blacklist if you want, as we’re going to overwrite it in a few moments.
4. Go back to Notepad and look at the list of Surfclarity URLs you just copied out.
Search for : and replace with : ‘ — note the space after the “:”.
Then search for : and replace it with ‘:true,
Now add ({‘ to the very start of this list, and ‘:true}) to the very end of this list.
Congratulations, you now have your SurfClarity list in Google Noise Reduction format.
5. Copy your new list to the clipboard, go back to greasemonkey.scriptvals.http://exego.net//Google Noise Reduction.blacklist, clear what’s in there at the moment, and then paste the new list in. You’re done.
Obviously, you can now also copy a backup of the Google Noise Reduction.blacklist
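If you have a long list, the reformatting in step 4 could also be done with a short Python script rather than by hand. The sketch below is an assumption-laden one: the function name is my own, it starts from a plain list of domains (you would first need to split your Surfclarity list into separate domains), and it uses straight single quotes on the guess that the curly quotes shown above are just blog typography.

```python
def domains_to_blacklist(domains):
    """Build a blacklist string in the Noise Reduction format shown in
    step 3: each domain quoted, followed by :true, separated by commas,
    all wrapped in ({ and }). Blank entries are skipped.
    """
    entries = ", ".join(
        "'{}':true".format(d.strip()) for d in domains if d.strip()
    )
    return "({" + entries + "})"

# Hypothetical input: a few domains already split out of the old list.
my_domains = ["britannia.com", "tandf.co.uk", "myspace.com"]
print(domains_to_blacklist(my_domains))
```

Back up the existing value first, as in step 3, before pasting the result in.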
14 Sunday Nov 2010
This ten-step tutorial gives you a Google Search for grown-ups, with all the training wheels taken off. Speed has not been compromised. The tutorial assumes that you’re using Firefox as your primary Web browser, that you use Google Classic when you’re signed-in to Google, and that you’re searching the Web from a widescreen desktop Windows PC.
This is not an optional “pick and mix” list, unless stated. Everything is needed, and sometimes one element relies on another element. All browser addons and scripts listed below are free, and come without any adware/spyware. I have tested them all myself. All are working happily together, and all are working with the latest roll-out of Google Search (as of 14th November 2010).
Step 1: If you don’t already have Firefox, please download and install it. If you have Firefox, make sure it is upgraded to the latest version.
Step 2: Download and install the Firefox Addon AdBlock Plus. This will automatically remove the bulk of advertising from Google search results pages.
Step 3: Download and install the Firefox Addon GreaseMonkey. This is the absolutely vital addon, the one that will run the GreaseMonkey scripts listed below.
Step 4: Close and relaunch Firefox, to ensure that the new addons are recognised and running.
Step 5: Now download and install all of the following GreaseMonkey scripts. All these have been tested, and they work together without conflicts. We’ll need to configure some of them, but we’ll do that once you’ve downloaded and installed them all. Installation is as simple as clicking two buttons.
For Google Search:
* Google Preview Killer. No pop-up visual previews.
* Google 100. Lets you have more than 10 search results per page.
* New Google Ad-block. Blocks a few subtle types of Google Search ads not blocked by AdBlock Plus.
* Linkify Google Search Results. Makes the green Web address below each result useful.
* Scrub Google Redirect Links. Don’t allow Google to wrap search result links in gibberish code.
* Google Searches Exactly What You Type. No dumb second-guessing of what you want. You may need to turn this off if you find it causes looping on Google Books.
* GoogleMonkeyR. Allows all sorts of tweaks.
* No SearchWiki. Removes the little icons that appear next to each search result when you’re signed in (that’s right, the ones you never use).
* Customize Google Nav Bar. Lets you change and/or configure the topmost strip of text menu-links that lead to other Google Services…

* Hide/Toggle Google Sidebar. A very neat and elegant solution.
* Google Always Show Search Options. The search-by-date and other drop-down options on the sidebar are always fully opened up. This is needed because the Disable Autocomplete script breaks the sidebar’s menu-opening functions.
For Google Image Search:
* This is optional, but recommended. Google Image Basic + Direct Images. The first forces Google Images search to revert to how it looked and worked before it was Bing-ified. The second script adds direct links from thumbnails to the original images. They both work, and work well together.
Step 6: Now ensure that all those GreaseMonkey scripts you just installed are set to work on whatever the URL of your Google Search Home Page address is. Right-click on the smiling monkey icon in the bottom-right of your Firefox web browser. Select “Manage User Scripts…”.

Now scroll through the list of scripts, and check that each script is set to work on the correct Google web address. The scripts are pre-configured for this and should work fine without adjustment, but it may be that you use an obscure variant of the address for the Google home page. If in doubt, use something global like: http://*google.* (* = a wildcard, or instructions to ‘accept anything here’).
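If you’re unsure what a wildcard address will and won’t catch, you can try it out with a few lines of Python. This is only a sketch of the general idea (each * stands for ‘any run of characters’), not Greasemonkey’s own matching code, and the function name and test addresses are my own.

```python
import re

def include_to_regex(pattern):
    """Turn a wildcard address such as http://*google.* into a regular
    expression, treating each * as 'any run of characters' and every
    other character literally.
    """
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + ".*".join(parts) + "$")

rule = include_to_regex("http://*google.*")

for url in [
    "http://www.google.co.uk/search?q=jurn",
    "http://google.com/",
    "https://www.google.com/",
    "http://www.bing.com/",
]:
    print(url, bool(rule.match(url)))
```

Under this reading the first two addresses match and the last two do not. Note that the https address fails only because the pattern begins with http://, so a broader pattern would be needed if you use the secure version of Google.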

Step 7: Now we need to configure three of those GreaseMonkey scripts we just installed. The others should work fine without adjustment.
a. Google 100: Visit Google Search | In Firefox, go to Tools -> Greasemonkey -> User Script Commands | Choose “Set Google results per page” | Enter the number 18. Save and exit.
b. GoogleMonkeyR: In Firefox, go to Tools -> Greasemonkey -> User Script Commands | GoogleMonkeyR Preferences…
Then apply these preferences and Save…
c. Customize Google Nav Bar: This one is a little tricky to customise for non-coders. If you want to configure the links in the topmost menu of Google’s services…

… then right-click on the smiling monkey icon in the bottom-right of your Firefox web browser. Select “Manage User Scripts…”. Highlight the Customize Google Nav Bar script in the list. Click on Edit. Now make any changes by cutting or rearranging blocks of code thus, then Save…
Step 8: Explore your new Google Search experience. No Ads, No Autocomplete, No Reflowing of results as you type, No Pop-up Previews, No Fade-in, No code-wrapped URLs, No “did you really mean?”, No unwanted buttons or menu items, and with a neatly hidden (but available) sidebar. And with a few nice discreet extras, and an elegant presentation that takes full advantage of your widescreen desktop monitor.
I’m assuming, of course, that you’ve already turned off Google Instant in your Google dashboard settings…
Step 9: I also like control over what appears in Google Search results. I sometimes want to ban whole domains from ever appearing in the results (e.g. never see an informaworld.com page ever again). This is optional, but recommended. The best solution for this has always been the excellent Firefox addon SurfClarity. However, this has stopped functioning due to the recent radical changes at Google Search, as have most of the similar GreaseMonkey Scripts. Until SurfClarity is repaired, the only domain blocker for Google Search that I can recommend, and that I know is working with the new changes, is the GreaseMonkey script Noise Reduction for Google and Bing. It’s fairly elegant on the page, but does leave blank gaps where the erased search results were. The only real problem is that you can’t import/export the list of blocked domains. It also works for the Bing search engine.
Step 10. Sigh, relax, and enjoy a distraction-free searching experience.
13 Saturday Nov 2010
Posted in JURN's Google watch
Remove the new Google Image Search’s increasingly annoying ‘Bing-bling’, by using Firefox + GreaseMonkey + a potent combination of Google Image Basic and Direct Images in Google Image Search!. Image search then reverts to how it used to be. Clicking on a thumbnail in the search-results takes you straight to the largest version. When searching for images “larger than…” you may need to tell Firefox (one-time only) what application to open the image with, rather than popping up a “where would you like to download this to…” I told it to open large images with Firefox itself, and large images then open in a new Firefox tab. Nice.
And, while you’re at it… Flickr: link all sizes.
11 Thursday Nov 2010
PC Mag’s pundit John C. Dvorak calls it today…
“You can see the beginnings of Google’s ruin already … the recent and more aggressive changes have been terrible … It doesn’t take a genius to see that Google is beginning to make huge judgement errors.”
Much as I love Google, I’ll admit to a similar uneasiness in recent months. Wild and often silly experimentation with the core search results appears to be a product of chasing “the dumb market”. It’s also possibly a reaction to the apparent lack of innovation in search itself, exemplified by what seems to be the obvious failure (*) of Google Caffeine to suppress spammy search results and SEO spivvery. I wonder if yesterday’s global 10% pay rises at Google, aimed at stemming the outflow of people from the company, might be linked to this sense of failure?
Perhaps better to split the basic search almost in two, via the configuration options. Give people who don’t want to switch to a Firefox/GreaseMonkey/scripts solution a single tick box in the Google Options dashboard that says, in as many words…
“I am not a drooling idiot, please take all the silly training wheels off.”
Google also needs to invest far more heavily in free high-quality online training in how to search effectively. And to push it into schools at the junior level under the rubric of ‘search literacy’.
More commentary on the Dvorak article at Beyond Search. He thinks the article harsh, but concludes…
“What’s unfolding now is little more than visible signs that a systemic problem is disrupting functions. … The digital Black Death has taken root.”
* “Some 22.4% of Google searches done since June [2010] produced malicious URLs, typically leading to fake antivirus sites or malware-laden downloads as part of the top 100 search results, according to the Websense 2010 Threat Report published Tuesday”
16 Saturday Oct 2010
Posted in JURN's Google watch, Spotted in the news
Bing has announced it will allow users to bias search results, based on what their ‘friends’ on Facebook are searching for…
“Bing will incorporate users’ social data from Facebook to improve the personal relevance of your search results starting today. Facebook founder Mark Zuckerberg, Bing leader Qi Lu, and longtime Microsoft online veteran Yusuf Mehdi announced the news at Microsoft’s Mountain View office this afternoon.”
It’ll be interesting to do a test when the service rolls out, to find out if it actually improves the results or not. I suspect not, since most people are i) bad at searching, and ii) not using Bing.
03 Sunday Oct 2010
Posted in JURN's Google watch
Google has just launched their URL shortening service (http://goo.gl/) as a public service.
28 Tuesday Sep 2010
A summary of the Summer 2010 ROI Research survey of 500 search-engine users, just released…
“19% abandon the online search, taking it offline if they can’t find the information”
20 Monday Sep 2010
Posted in JURN's Google watch
It seems Google’s annoying “second-guessing” function has now been added to Google Blog Search. For instance, a search for “Lovecraft” now also picks up blogs that use the word “craft”, albeit not until the second page — presumably this is done on the assumption that the searcher doesn’t really know what they’re looking for. It’d be great if Google would allow advanced searchers to turn this dumb feature off, so we don’t have to keep on typing a + sign in front of keywords and phrases.
Update: a day later it seems to have returned to normal.
23 Monday Aug 2010
Posted in JURN tips and tricks, JURN's Google watch
I’ve expanded my guide to how to build and install your own self-hosted linked Google CSE, and published it as a new book…