Removed Palgrave journals from the JURN index, since their generous offer of “all journal articles for free” during March 2014 is finishing today.
New laws in the UK will soon mean that…
scientific facts can be extracted and published without explicit permission [something that is set to become] law on June 1st.
The Shuttleworth Foundation has a concise round-up of the measures, plus Web links to the British government’s ‘plain English’ PDFs about the new measures. Oh, and that old-fashioned CD-ripping-to-MP3 thing becomes legal too.
Who is JURN for?
* independent scholars and researchers
* students and lecturers in developing nations
* unemployed or retired lecturers
* recent university graduates
* knowledge professionals outside of academia
* business leaders
* public policy makers and planners
* journalists and bloggers
* public intellectuals and ‘think tanks’
* evidence-based campaigners and activists
* amateur historians
* teachers of students aged under 18
* advanced and ambitious students, age 14-18
* home schoolers and grassroots educators
* adjunct or associate university lecturers, seeking a substitute for lost paywall access during the long summer holiday
* university lecturers and students, seeking a straightforward search tool for full-text open access content
“A Google engineer has developed an algorithm that spots breaking news stories on the Web and illustrates them with pictures.”
Nearly 20,000 hi-res maps of America have been released under CC0 by The Lionel Pincus & Princess Firyal Map Division, The New York Public Library.
Here’s the press release, which has a download link to a free-registration download service containing the hi-res versions and also links to tutorials on how to use the service.
I had a quick look at the full list of Schema.org tags, which are now available in Google CSEs. They can be used to filter the CSE’s site list, serving to “Restrict pages from the above site list to only those that contain [chosen] Schema.org types”. Handy if you have a huge single site of HTML/CSS/XML that you can grep, and you want to prepare it for selective CSE search without having to juggle directories and file names.
It looks to me like those tagging open access scholarly articles would need to be able to chain Schema.org tags into something like…
CreativeWork: ScholarlyArticle: TransferAction: DownloadAction: GiveAction:
Whereas paywall publishers might need something like:
CreativeWork: ScholarlyArticle: TransferAction: DownloadAction: SellAction:
But at present there seems to be only the basic undifferentiated…
Even if there were workable OA additions to Schema.org, there would still be the huge problems of: i) persuading people to add the tags to all their ongoing content at the article level, and to do so correctly and consistently; and ii) persuading them to go back and accurately tag perhaps two decades or more of existing open access articles.
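For what it's worth, here's a rough sketch in Python of what the tag chains above might look like if expressed as JSON-LD. It is purely illustrative: the `potentialAction` nesting and the GiveAction/SellAction split follow the hypothetical chains mused about above, not any agreed Schema.org convention for marking open access.

```python
import json

# Hypothetical sketch only: Schema.org has no agreed OA marker, so the
# GiveAction / SellAction distinction here is the post's own speculation.
def article_markup(title, url, open_access=True):
    """Return a JSON-LD string flagging an article as free or paywalled."""
    action_type = "GiveAction" if open_access else "SellAction"
    markup = {
        "@context": "http://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "url": url,
        "potentialAction": {
            "@type": "DownloadAction",
            "result": {"@type": action_type},
        },
    }
    return json.dumps(markup, indent=2)

print(article_markup("An open access article", "http://example.org/a1.pdf"))
```

A crawler or CSE could then, in principle, grep a site for `"GiveAction"` versus `"SellAction"` to separate the free sheep from the paywalled goats.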
From today, D&AD is offering free membership, which allows people access to an online archive of every ad to win a D&AD award [and] free copies of the D&AD annual
All the six-inch to the mile Ordnance Survey maps of Great Britain, 1842-1952. Now free, zoomable, and synced as geo-located historical series onto Google Maps.
It would be nice if they could get a system to extract all the keywords from the map lettering, rectify the (inevitably corrupted) keywords by fuzzy matching each of them against a standard historical gazetteer / place-name list for the area, then inject the hyper-linked names into each map’s page as keywords. That way the maps would be more easily searchable by keyword in Google Search. I’m not sure that’s even possible when old-style text is overlapping with graphical elements, as seen below, but it might be interesting to try…
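The rectification step suggested above could be sketched with nothing fancier than Python's standard-library fuzzy matcher. The place-names and the OCR corruptions below are invented for illustration; a real system would use a proper historical gazetteer for each map sheet.

```python
import difflib

# Fuzzy-match each (possibly corrupted) keyword extracted from the map
# lettering against a gazetteer of known place-names for the area.
def rectify(ocr_words, gazetteer, cutoff=0.8):
    """Map each OCR'd word to its closest gazetteer entry, dropping non-matches."""
    matches = {}
    for word in ocr_words:
        close = difflib.get_close_matches(word, gazetteer, n=1, cutoff=cutoff)
        if close:
            matches[word] = close[0]
    return matches

gazetteer = ["Gerrardsfold", "Winstanley", "Billinge"]
print(rectify(["Gerrardsf0ld", "Winstan1ey", "xxxx"], gazetteer))
# "Gerrardsf0ld" and "Winstan1ey" are rectified; unmatched junk is dropped
```

The rectified names could then be injected into each map's page as hyper-linked keywords, ready for Google Search to pick up.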
The modern names of places can, of course, already be looked up. But “Gerrardsfold” for instance, seen above in Cheshire, can only take one to a “Gerrards Fold Barn” in Lancashire when using the service’s modern lookup gazetteer.
The academic world recently learned that bots can write automated gibberish and — with a little help from their fleshy minions — can have it published in mainstream peer-reviewed scientific publications. But are we prepared for what follows from the moment when bots can reliably produce writing that makes real sense, and which is useful and timely enough for use in major newspapers? It’s happening already. The finances of newspapers are such that a wave of robo-journalism seems inevitable, once we have a few more advances in semantics and automated basic fact-checking. Given the current dismal state of newspaper science reporting, such new-fangled robo-news may even be slightly better than what we have now.
It follows that journal editors and publishers may soon need to add a new clause to their author guidelines, such as: “articles must be fully written by humans”. Not for fear of gibberish faux-papers, but rather because bots will be able to add sensible summaries and otherwise usefully aid in the writing of a research paper. Or we may need to develop an agreed form of simple presentation to flag up:
[bot]this section of the text was written by bots[/bot] and to embed links to the bot’s sources.
Incidentally, I’ve also often thought that the humourous LOLcat language would form a pleasing basis for identifying messages-sent-to-humans by objects embedded in The Internet of Things, clearly marking their simpler forms of communications to us as being: ‘not kreated by th humanz’. We already have the LOLcat translation systems available.
aim[s] to identify, incubate and spin off into the commercial sector viable online applications based on the re-use of digital cultural heritage content [from Europeana, and] The best five applications will be invited to a final challenge event to pitch their ideas to representatives from the cultural and creative industries as well as to investors.
The current challenge has a Natural History theme. Deadline: 31st March 2014.
Springing to mind: a simple workflow for automated extraction and smoothing of 3D shapes from high-res 2D photos of organic shapes (shells, fossils, wings, insect carapaces, etc), to create a royalty-free bank of organic starting-point shapes for rapidly iterative and generative product design prototyping.
Chlamyphorus truncatus, via Europeana. The prototype for your new girls’ hairbrush has arrived… 😉
Routledge’s new Porn Studies journal wants to luv you short time…
“Porn Studies is open-access for a limited time only, so anyone can read all of the papers in the inaugural issue. Paper titles include ‘Psychology and pornography: some reflections’ and ‘Gonzo, trannys, and teens: current trends in U.S. adult content production, distribution and consumption’.”
The major new Chinese Google-alike Chinaso, just launched, has a Chinaso Theory for those eager to march down “The socialist road with Chinese characteristics”, joyously waving their copies of the Theory of Party Building journal…
Completed the addition of URLs from the open ejournals in ornithology list. JURN now has excellent coverage of free and open ornithology journals.
“The Music Composed By An Algorithm Analysing The World’s Best Novels”. Researchers at the National Research Council Canada have used software to automatically measure…
“the way the emotional temperature changes throughout a novel, and then [have] automatically generated music that reflects these moods and how they evolve throughout the book.”
Although perhaps it’s worth remembering that Eno says he doesn’t bother to sync his ambient music and images these days — since he finds that people experience them as being synced together, even when they’re not.
I’ve overhauled the code that’s driving search and display for JURN, plugging in newer v2 CSE code and wrangling in some new CSS. JURN should now be a little faster than before, while giving Google a little less server overhead.
Changes, as seen above:
1. A spiffy new graphical “Search” button to click. Next to it is an X to click, which clears the search and starts over again.
2. Removed the confusing and misleading “Found 565,000 hits in 0.4 seconds…” notifications. Google was never actually giving JURN’s users that many hits anyway; it simply wasn’t spending valuable computational time finessing down the main index numbers for the benefit of curated Custom Search users.
3. The search results page links — found at the very foot of the search results as 1 2 3 4 5 6 7 8 9 10 — are now aligned left.
4. The faint dotted underline on links is now carried over onto the actual results links. Last week Google started taking underlines off search results altogether, though it was mostly tech-heads who spotted it being tested. For now, links are still underlined on the JURN results. But if underlines do get taken off Google links in the near future then I’d hope my faint dotted underline will remain to soften the blow for traditionalists.
5. A millisecond delay as the search-box loads, on first visiting the page.
I found a 2013 article from geoscientists who had tested Google Scholar: “Literature searches with Google Scholar: Knowing what you are and are not getting”. Although the body of the paper states that their test phrase was “wildfire-related debris flows”, the data shows they actually tested Scholar with the keywords wildfire-related debris flows. They reportedly found that…
“free articles were available in PDF format for 88% of citations returned by Google Scholar. They were available from open-access journals or via links to organizational sites where authors had posted their publications.”
However, if you actually look at their linked search-results data file, the above statement needs additional clarification: it’s clear that paywall articles from Elsevier, Springer and the like, appearing in their Scholar results, were being counted toward those “free articles”. It turns out that many of these were “free” only via a DigiTop proxy overlay for Scholar that is, in the words of DigiTop, “available to USDA employees only”. Nice if you work under the U.S. Department of Agriculture umbrella, but it seems that those outside have to pay.
Does Google Scholar perhaps need to add some kind of “paywall box detector” to its scraper bots? Then perhaps something like [PDF] [-||-] could be added on the right-hand column of the Scholar results, to indicate a PDF that’s “available maybe” — but which will prove to have a paywall that needs to be either backed out from or negotiated? And perhaps [PDF] [-~-] could indicate a genuine direct link to a bona fide PDF file?
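A crude version of such a detector might just look at what a Scholar [PDF] link actually returns. The sketch below is pure speculation; the paywall marker strings are invented for illustration and have nothing to do with how Google actually classifies links.

```python
# Toy "paywall box detector": classify what a [PDF] link really serves.
# The hint strings are invented assumptions, purely for illustration.
PAYWALL_HINTS = (b"purchase", b"sign in", b"institutional login")

def classify_pdf_link(content_type, body):
    """Return '[-~-]' for a bona fide PDF, '[-||-]' for a likely paywall page."""
    if content_type == "application/pdf" and body.startswith(b"%PDF"):
        return "[-~-]"   # a genuine direct link to a PDF file
    if any(hint in body.lower() for hint in PAYWALL_HINTS):
        return "[-||-]"  # an HTML page asking for money or a login
    return "[-||-]"      # anything that isn't a real PDF is suspect

print(classify_pdf_link("application/pdf", b"%PDF-1.4 ..."))
print(classify_pdf_link("text/html", b"<html>Purchase PDF, or Sign In</html>"))
```

The first call flags a real PDF, the second a paywall page — roughly the [-~-] / [-||-] distinction mooted above.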
Anyway… this is what geoscientists are talking about when they refer to wildfire-related debris flows. Seems like it might be a geological process that intelligent farmers, hiker-campers, and treeline homesteaders around the world would like to learn some precise details about…
Giant mudslides, basically.
Incidentally, the same wildfire-related debris flows search in JURN needs to be tightened up just a little for strong results. Using wildfire-related “debris flows” works better, though the first six pages of good results do stray just a little (to pick up what seem to be three articles about prehistoric ‘dinosaur-era’ debris flow events). Yet even on this test JURN appears to be doing about twice as well as Google Scholar in terms of getting open articles, once Scholar’s ‘false-positive’ paywall PDFs from Elsevier & co. are subtracted from Scholar’s results.
I found a fun 2013 article by Dorothea Salo, “How to Scuttle a Scholarly Communication Initiative”. Dorothea hilariously explores the festering tar-pits of institutional politics, amid which a fragile scholarly communication initiative is expected to bloom.
Paperpile has been reviewed by PC World magazine (4th March 2014). Paperpile is a browser-based competitor to Mendeley. It integrates tightly with Google services such as Google Scholar and Google Drive, and can also slurp academic PDFs “directly from Google search results”. I’d be interested to hear if it works with JURN. Once the found PDFs are in your Google Drive cloud storage, it’s reported that…
“Paperpile analyzes your papers and acquires all the necessary metadata by itself.”
Sadly it’s only for the Chrome browser, not Firefox. At present it seems to be just a personal workflow aid, since there’s no collective exposure of the found content to a single public search box (as is offered by Mendeley’s “Search papers” search box).
Most papers will be downloaded at speed, because they “seem like they might be worth looking at later”. Yet if Paperpile were able to measure re-open rates, view duration and frequency, and the actual level of citation in a person’s finished project or work, then that would be an interesting basis for a bumping algorithm that could help power the results ranking in a public searchable catalog. Especially if Paperpile could broadly match or align your research interests with those of similar Paperpile users, in combination with a more standard citation analysis, to give you a tailored search experience. Although in practice I guess there would be huge and possibly unwanted feedback amplification loops generated by that approach, as search results could veer heavily toward the latest fashionable topics. Doubtless Google has this nailed down already, and there’s probably a Trendy Search Topic Surge Controller employed somewhere in the Googleplex.
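To make the bumping idea concrete, here's a toy scoring function. All the weights and file-names are invented for illustration; a real ranker would obviously need tuning, and some damping against exactly the feedback loops worried about above.

```python
# Toy "bumping algorithm": weight a paper by how often it is re-opened,
# how long it is actually viewed, and whether it ends up being cited.
# All weights here are invented, purely for illustration.
def bump_score(reopens, total_view_minutes, cited_in_final_work):
    score = 1.0 * reopens + 0.1 * total_view_minutes
    if cited_in_final_work:
        score += 10.0
    return score

papers = {
    "skimmed-once.pdf": bump_score(1, 2, False),
    "worked-from.pdf": bump_score(8, 90, True),
}
ranking = sorted(papers, key=papers.get, reverse=True)
print(ranking)  # the paper that was genuinely used ranks first
```

A speed-downloaded, never-reopened PDF scores low; the paper someone actually worked from floats to the top of the hypothetical public catalog.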
“What were librarians thinking of?” A question I often ask myself, as I glance at various pointless and fruitless busy-work projects. But now there’s a new survey of the views of “academic library directors in the U.S.”, which gives some insight. Scholarly Kitchen has a handy digest of the report…
In 2010, 41% of library directors said that, if given a 10% budget increase, they would like to spend at least some of it on discovery tools. In 2013 only 16% said the same thing.