Yippy-die-day…

Sad to see that the search-engine Yippy, based on Bing, has yipped its last yip. The domain now bounces to DuckDuckGo. When its sources were last heard of, Yippy was a version of Bing but with a strong boost given to sites for complex coders, regex wranglers, JavaScript jugglers and HTML hammerers. Also technical hobbyist sites in other fields, it was said. As such it was rather useful on occasion. It was also nice that it didn’t freak out if you ran the same search seven or eight times or more. It tolerated drilling at depth, which Google now has problems with.

DuckDuckGo is partly based on Bing (a blend of Bing and Yandex, when its sources were last heard of). It appears to be unknown if there has been a back-end ingestion that makes it a replacement for Yippy. But a few initial tests suggest it might be a reasonable replacement, and may have had some of Yippy’s weightings plugged in. For instance try a search for…

“href.replace” regex script

But if you want a technical search for your field or hobby in 2021, with full indexing reach and relevance-ranking, it’s probably best to create a Google Custom Search Engine (CSE) and populate it with about 100 or so of the top relevant URLs.
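As a rough sketch of the 'populate it' step, a short Python helper can turn a hand-collected list of site URLs into de-duplicated, site-wide patterns ready for pasting into the CSE control panel. The `host/*` pattern form and both function names are assumptions made for illustration, so check them against what the CSE panel actually accepts.

```python
from urllib.parse import urlsplit


def to_cse_pattern(url: str) -> str:
    """Turn a URL into a site-wide pattern such as 'www.example.org/*'.

    The 'host/*' form is an assumption for illustration only.
    """
    # Add '//' when no scheme is present, so urlsplit can find the host.
    parts = urlsplit(url if "://" in url else "//" + url)
    return f"{parts.netloc.lower()}/*"


def build_site_list(urls):
    """Return unique patterns, keeping the order in which sites first appear."""
    patterns = []
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank lines in the pasted list
        pattern = to_cse_pattern(url)
        if pattern not in patterns:
            patterns.append(pattern)
    return patterns


if __name__ == "__main__":
    # Hypothetical example sites, not real recommendations.
    sites = ["https://forum.example.org/threads/1", "forum.example.org/faq"]
    print("\n".join(build_site_list(sites)))
```

Keeping the whole host rather than individual pages suits the 'top 100 sites' approach, since each pattern then covers everything the site publishes.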

Added to JURN

Music For and By Children

IMPAR: Online Journal for Artistic Research

Journal of Controversial Ideas

Border and Regional Studies

Ethnologia Polona

Dostoevsky Studies

IBERIA : An International Journal of Theoretical Linguistics

Interdisciplinary Egyptology (forthcoming later in 2021)

MINIKOMI: Austrian Journal of Japanese Studies

The Journal of the European Association for Chinese Studies


Water Cycle

CC Search goes to WordPress.org

CC Search, aka the Creative Commons search engine, is moving to WordPress.org, which has also “hired key members of the CC Search team”. WordPress co-founder Matt Mullenweg states on his blog that, for CC Search, “audio and video [are] soon to come” with the support of WordPress.org.

CC Search should not be assumed to be a one-stop solution. It appears, for instance, to be completely useless for DeviantArt. Presumably DeviantArt is traffic-shy in that respect, and the CC Search bots are being blocked there. If WordPress could find a way to have DA open up, that would be a major boost to the service.

Your Ulthar battle-cat army has arrived…

Scan The World is a new site for free Creative Commons scans of real-world objects, aimed at people who want to waste time and money on making worthless plastic tat (or rather, 3D-print delightful plastic objects). We’ve seen such sites before, but this one looks like it’s well-organised and commercial enough to succeed.

Sadly the 3D printing angle means that “objects” is often where it ends, as nearly all my test downloads under full Creative Commons were simple .OBJs and thus lacked the vital photogrammetric textures seen in the previews. Those that did have textures tended to be under non-commercial Creative Commons. Such as this fab 3D printable Cat Armour.

I somehow doubt it has a medieval original, but there were medieval rocket-cats, so you never know…

Overall, despite the limitations and ads, Scan The World is an ‘open downloads’ site and no sign-up is required to download.

Petal Search

Petal Search is an English-language search-engine from China’s Huawei, apparently with its own index. After cleaning all the unwanted cruft off the front-page with uBlock, it can be made nicely minimalist…

Images seemed the most useful. But it turns out the ‘HD’ filter is puny, regarding a mere 900 x 1200 as ‘high’. Still it’s possibly useful as a third-opinion on images, as it gives very different results than DuckDuckGo Images or Google Images.

News feels like Bing, but without the extreme timeliness and with a whole lot of local British news seeming to be filtered through a cheesy relayer called dailyadvent.com rather than going directly. At a guess that may be to comply with Chinese government requirements?

The main search is sprinkled with ad-cards, but these are easily removed with uBlock. Definitely not as good as Google for the first page. I suspect the problem is in running weaker semantics on the query rather than in the index.

One thing it is, is fast. Very fast. They’ll be using their own ‘special’ routers, no doubt.

More Unsplash

A few years back I made three curated picks from the Unsplash CC0 Creative Commons collection, and posted these here…

* Libraries and archives theme.

* Creative Industries I theme.

* Creative Industries II theme.

The CC0 status was later changed. For instance, Alex Knight uploaded his Robot picture under CC0 Creative Commons, prior to the end of 2017. But Unsplash now has its own licence, under which it is claimed this formerly CC0 Creative Commons picture now sits.

The new licence is not that bad actually, on glancing through it… it seems to just prevent the big stock companies from ingesting en masse and re-selling. But now comes the news that the evil stock agency Getty Images is about to “acquire” Unsplash outright anyway.

So here’s another pick, and under the still relatively permissive non-Getty licence, before the purchase goes through and any changes start happening.

What follows is ‘Creative Industries III’. As before, images reduced a little to make them more wieldy, and a few spamming brands (Apple, Nike etc) were airbrushed away. Photographer names are in the file-name, and should be credited if used in print etc. A few of the older pictures (see my collections above) don’t pop up for me in search, where you might have expected to see them again, and may have since been deleted or moved. Useable “draw-on-the-screen” images are very rare, but I found two good ones.


Creative Industries III:

Daylight code writing (useful, as most such pictures are on black):

All-night code writer:

Guitar maker:

Leather crafts maker:

Glass crafts maker:

Pattern knitter:

Children’s book illustrator:

Graphic illustrator:

Special-interest magazine editor:

General designer using a Cintiq or XP-Pen:

Table-top RPG game designer / miniatures painter:

Pinball table designer:

Children’s party clowns:

Branch librarian / local documentary-maker:

Field researcher (a bit spammy, re: the brand, but the closest I could get which says ‘field research’):

Local history writer:

Philosophers / old books conservator:

Sports vehicle designer:

Vehicle livery colourist:

Hair stylist:

Local TV studio, junior camera operator:

Local TV broadcast station desk-jockey:

Live theatrical event desk-jockey:

Podcast interviewer/presenter:

Synth/trance musician / YouTube celeb:

Young stage drummer:

Young comics reader:

Children’s creative dress-up:

Community dance:

Censorship:

Google’s Live Caption, now on desktop PCs

Isn’t the Internet wonderful. Just this morning I was searching and wondering why is there no audio “automatic transcription” software for desktop PCs? This evening… Google’s Live Caption feature is now available on the desktop PC, via the Chrome browser. For free, and running locally and offline and without a Google login.

To enable real-time live subtitles (aka ‘closed captions’ or ‘live captioning’) as your audio or video plays back, first get the latest Chrome, then open its Settings and go…

Advanced > Accessibility > Captions > (arrow icon) > Live Caption

…and turn it on in Chrome. At this point a set of speech-definition files will be downloaded, to enable the real-time detection of what’s being said. While you’re waiting, set up the preferences for fonts and colours etc.

Those used to AI sets of 1GB or more will find the Live Caption files are downloaded in a few minutes, even on a slow connection. Other than the initial download of the definition files, the service works locally on the PC and without a Cloud connection. So far as I’m aware this is the first time such a free service has been available without a Cloud-upload being needed, still less in real-time.

For this reason I would expect to see third-party UserScripts relatively soon, to enable the transcription to be easily captured into an editable text file as it plays. The playback / transcription continues to run, even when Chrome is not the focus of what you’re doing on the PC, which should help with scripted capture. Obviously if you want the whole thing you would have to let it play back first, to get a full transcription.

Can a recorded .MP3 be loaded and work? As well as a live stream? Yes, it works very well. A podcast with a 90-year-old guy on a smartphone, and kind of ok-ish voice quality… it handled that well. In real-time.

As you watch it, it occasionally goes back and auto-corrects and seems to be doing this based on word context. So I’m guessing it’s not just speech-to-text, but also text-to-text context tweaking. But it can’t work miracles: “gorilla campaign” rather than “guerrilla campaign” etc. And swearing does get f****** bleeped out with asterisks. It can’t detect different speakers. You can’t copy-paste. Still, it’s going to be very useful, especially if you just want a few paragraphs for a quote. Until we get a capture script, you can do things like screen grab with Microsoft OneNote, which handles small fonts fine and can make text from a screengrab very easily.

Incidentally, if you want to edit your audio files first, the venerable freeware Audacity 3.0 is now available.

Block the Google Doodle from search-results pages

Google is trying to creep its Google Doodle visual-messaging, often unwanted and usually distracting, into the top-left corner of the actual search-results page. Block it in the Web browser with the uBlock Origin addon, and this simple filter command pasted into your custom ‘My filters’ list…

http://www.google.com##.BA0A6c

If your Web browser insists on adding http:// to the www. bit of the above address (and thus making it a clickable link) then remove this from the copied line. It should look like this…

www.google.com##.BA0A6c
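For anyone copying several such filters at once, the clean-up can be sketched in a few lines of Python. This is only an illustration: `clean_filter_line` is a made-up helper name, and it assumes the only damage is a leading http:// or https:// added by the browser or blog software.

```python
import re


def clean_filter_line(line: str) -> str:
    """Strip a leading http:// or https:// from a copied filter line.

    Hypothetical helper for illustration; everything after the scheme
    (hostname, '##' separator, CSS selector) is left untouched.
    """
    return re.sub(r"^\s*https?://", "", line).strip()


if __name__ == "__main__":
    copied = "http://www.google.com##.BA0A6c"
    print(clean_filter_line(copied))
```

The part before ## is the site the rule applies to, and the part after it is the CSS selector of the element to hide, which is why the stray scheme needs removing.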

There’s also a No Google Doodle UserScript, to remove the doodle from the main landing page.