It’s the back end of 2018 and there’s still no really useful and comprehensive search tool for recent blog posts, other than the main Google Search. And even that is iffy. Given that we’re approaching Halloween, I decided to do a quick group test with the simple keyword Lovecraft. He’s a good choice because so much utter trash floods onto the Web in his name. If a search can deal with Lovecraft, it should be able to handle much else.
* Google News: Can filter by ‘blogs’ and by ‘date’, but the results are laughable — are there really only eight blog posts on Lovecraft in October 2018, from worthy long-form and timely-news bloggers? I think not. (Another test for ‘Staffordshire’ suggests News | Blogs is almost all just press-release outlets and similarly worthless pseudo-blogs).
* Google Search: The inblogtitle:keyword modifier is no longer useful in search, as it now returns only 10 irrelevant results when used with Lovecraft. One used to be able to find sites that Google ‘knew’ were blogs, and had a keyword in their main blog title. Google Search has also removed ?tbm=blg from their URL options.
* WordPress.com internal cross-blog search: Simple to use, the results look pretty, but it obviously has very mediocre coverage of its own blogs. Many expected and well-respected blogs do not appear at all. Users need to be aware that they are not seeing results from the entire range of non-spam WordPress.com hosted blogs.
I would suspect that DuckDuckGo may be using this WordPress.com results set as a de facto anti-spam whitelist, since that would explain its curious big gaps in the coverage for WordPress.com blogs. The same may be true of the dismal Bing, whose only saving grace is the excellence of the Bing News | Most Recent results, which you can RSS-ise by adding &format=rss to the URL. By comparison, NewsNow is nowhere.
* You Got Blogs, a Google CSE: Fairly good at pulling the top three currently-active blogs to the top of the results, but thereafter turns to mush. If the user then sorts by Date on a single keyword, the results are far less useful, mainly because You Got Blogs is indexing all *.wordpress.com/* pages rather than just the blog posts via *.wordpress.com/20*/*. You Got Blogs is reliant on Google Search, since it’s a CSE, and thus for many blogs Google will only show the most recently-indexed post or else just the front page (e.g. you make seven posts a week, but Google will only show searchers the post it has most recently indexed, and the others will be un-findable). It’s thus an impossible balancing act for You Got Blogs (or any other blog-focussed CSE): if they don’t do a global index of *.wordpress.com/* then they miss a whole lot of results, and if they do then the sort-by-date results fill up with pages that are not posts.
* Regrettably, setting up a Google CSE (for *.wordpress.com and *.blogspot.com etc) is not an option. I’ve tried it, and in practice it doesn’t work well when one sorts by Date. It’s sort-of-ok on a straight search, if making a first search looking for blogs on one’s topic, though the main Google Search would do better. A CSE picks up and lifts to the top of the results some very out-of-date and moribund blogs, and obviously can’t deliver usable sort-by-date results.
* Social Mention. Search restricted just to ‘Blogs’. Pathetic results from ‘Blogs’. No results at all for ‘Microblogs’. Top three results were very similar to the WordPress.com internal search, then a huge gap in time. My guess is they’re blending together the WordPress.com and Bing APIs, and to no great effect.
* DuckDuckGo: Should, theoretically, be good. But is mediocre. It all-but ignores key Lovecraft blogs, blogs which rank very highly in Google Search. I should note that the Duck is excellent in many other respects, especially the relevance of its Image Search. But it still lacks breadth and depth.
* Instant RSS Search Engine. No longer appears to work, even when tested in multiple browsers.
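The distinction drawn in the You Got Blogs entry above, between indexing every *.wordpress.com/* page and indexing only the dated post URLs, can be roughly sketched in Python. This is only an illustration of the URL pattern, not how any of the services above actually work. The function name and the list of hosts are my own assumptions, and the check simply looks for a /20YY/MM/ path on a WordPress.com or Blogspot subdomain.

```python
import re
from urllib.parse import urlparse

# Hosts assumed to embed the date in their post URLs (illustrative list).
DATED_HOSTS = ("wordpress.com", "blogspot.com")

# Path must start with a year beginning "20" and a two-digit month, e.g. /2018/10/
POST_PATH = re.compile(r"^/20\d{2}/(0[1-9]|1[0-2])/")


def is_dated_post(url: str) -> bool:
    """Return True if the URL looks like an individual dated blog post."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    # Only subdomains count, e.g. example.wordpress.com, not wordpress.com itself.
    on_blog_host = any(host.endswith("." + h) for h in DATED_HOSTS)
    return on_blog_host and bool(POST_PATH.match(parts.path))


if __name__ == "__main__":
    for candidate in (
        "https://example.wordpress.com/2018/10/31/some-post/",
        "https://example.wordpress.com/tag/lovecraft/",
        "https://example.wordpress.com/",
    ):
        print(candidate, is_dated_post(candidate))
```

A filter like this keeps post pages and drops front pages, tag pages and 'about' pages, which is the trade-off described in that entry.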
For niche news gatherers wishing to supplement their RSS feedreader and break out of the tiny-minded Twitterbubble, the best option at the end of 2018 is thus to set up a bookmarks folder in your Web browser with the following:
site:wordpress.com/2018/10/ “Lovecraft” -zombie -game -movie
site:blogspot.com/2018/10/ “Lovecraft” -zombie -game -movie
Vary according to your desired keyword and knockout words, obviously. These URLs will work because all blog posts on Blogger and WordPress have the date embedded in their URL.
These bookmarks should be set to run on Google Search and DuckDuckGo and Yandex (the latter with a &lang=en English only filter in the URL). Right-click on the finished Bookmarks folder, select “Open All” and they all load.
Of course, this doesn’t pick up self-hosted blogs, only the free ones. And, obviously you’ll have to manually go in and incrementally change the date numbering in the target URLs, at the end of each month. Thus it’s not a perfect solution. (Nor can this solution be amalgamated into a Google CSE, for the reasons stated above).
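The monthly chore of changing the date numbering could be handed to a small script. What follows is a minimal sketch in Python, not a tested tool. The search endpoints and query parameter names (q for Google and DuckDuckGo, text for Yandex) are my assumptions about how those sites currently accept searches, and the function names are invented for the example. Only the site: pattern, the knockout words and the lang=en filter come from the text above.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed endpoints: (base URL, query parameter name, extra parameters).
ENGINES = {
    "google": ("https://www.google.com/search", "q", {}),
    "duckduckgo": ("https://duckduckgo.com/", "q", {}),
    "yandex": ("https://yandex.com/search/", "text", {"lang": "en"}),
}
HOSTS = ("wordpress.com", "blogspot.com")


def build_query(host, keyword, knockouts, when):
    """Build e.g. site:wordpress.com/2018/10/ "Lovecraft" -zombie -game"""
    site = f"site:{host}/{when.year}/{when.month:02d}/"
    minus = " ".join(f"-{word}" for word in knockouts)
    return f'{site} "{keyword}" {minus}'.strip()


def bookmark_urls(keyword, knockouts, when=None):
    """Return one search URL per engine and host, for the given month."""
    when = when or date.today()  # default to the current month
    urls = []
    for base, param, extra in ENGINES.values():
        for host in HOSTS:
            query = build_query(host, keyword, knockouts, when)
            urls.append(base + "?" + urlencode({param: query, **extra}))
    return urls


if __name__ == "__main__":
    for url in bookmark_urls("Lovecraft", ["zombie", "game", "movie"]):
        print(url)
```

Run at the start of each month, it prints six URLs that can be pasted over the old bookmarks. It does nothing about self-hosted blogs, for the same reason as before.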
Once the searches have loaded, switching through to a “week” or “24 hour” view will require the copious use of Google Hit Hider by Domain, to weed the spam and unwanted results. Google Hit Hider knocks out unwanted domains from search results, and does it very well. (Google Hit Hider can run on Yandex, it just needs the results reloaded, in order for its blocking buttons to appear).
Even having set up such a one-click Bookmarks folder, we also still have the problem of Google Search sometimes only offering the front page of a timely and frequently updated blog, rather than its most recent post URLs. In practice though, for a ‘last 24 hours’ search, you don’t actually need a site: modifier…
site:wordpress.com “Lovecraft” -zombie -game -movie
All you need is the ‘last 24 hours’ filter alone, and Google Search will lift some of the best content into the first two pages of results. Kind of useful, as it can thus catch self-hosted blogs, albeit jumbled among legacy news sources and updating catalog sites etc. Even so, you’ll want Google Hit Hider when working at the 24 hour level.
Also useful, inside your new folder, will be a similarly hard-coded Google Images search URL for the last 24 hours or week…
“keyword” -pinterest -youtube -reddit -twitter -wikipedia -tumblr -instagram
… and so on. It only takes a few seconds to visually check the results, and such timely visual results are often useful re: new books, conference posters etc. Keep eBay listings in the mix as they can suggest interesting blog post topics, about vintage stuff. Again, we’re not keying the search to blogs only, and thus Google Hit Hider is your friend here (it also works on Google Images results: block on Google Search, and it’s also blocked on Images).
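The hard-coded ‘last 24 hours’ and ‘last week’ URLs, for both the ordinary search and the Images search, could be generated along these lines. Treat it as a sketch under stated assumptions. The tbs=qdr:d and tbs=qdr:w recency values and the tbm=isch Images switch are undocumented Google URL parameters as I understand them, and they may change or vanish just as ?tbm=blg did. The function name is invented for the example.

```python
from urllib.parse import urlencode

# Assumed, undocumented Google parameters: past day / past week.
RECENCY = {"day": "qdr:d", "week": "qdr:w"}


def recent_search_url(keyword, blocked=(), window="day", images=False):
    """Build a Google URL for a quoted keyword, minus unwanted words or sites."""
    minus = " ".join(f"-{name}" for name in blocked)
    query = f'"{keyword}" {minus}'.strip()
    params = {"q": query, "tbs": RECENCY[window]}
    if images:
        params["tbm"] = "isch"  # assumed switch for Google Images
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    blocked = ["pinterest", "youtube", "reddit", "twitter",
               "wikipedia", "tumblr", "instagram"]
    print(recent_search_url("Lovecraft", ["zombie", "game", "movie"]))
    print(recent_search_url("Lovecraft", blocked, window="week", images=True))
```

The resulting URLs go into the same Bookmarks folder as the others, and Google Hit Hider still does the weeding once they load.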
There are of course also a whole bunch of “request a demo” agency services which claim to offer social media sentiment tracking. They seem to be of the ‘if you have to ask the price, you can’t afford it’ sort. There’s one free and public service worth a look, Social Searcher. Very slow to load a search, but it’s pretty and it works. It’s no use for blogs, but seems useful if you want to quickly glance across recent Facebook and Twitter posts. It covers some other ephemeral sharing sites, but their signal gets swamped by Facebook and Twitter. Not that that matters much, as it’s almost all blather and parroting, of no news value. To prevent results turning into a wall of hashtags, the tags panels can be blocked in uBlock Origin with