
News from JURN

~ search tool for open access content


Monthly Archives: September 2015

Placing text

29 Tuesday Sep 2015

Posted by David Haden in How to improve academic search


A fascinating and very clearly written April 2015 article about automatically mining geolocation points out of plain text: “Mapping Words: Lessons Learned From a Decade of Exploring the Geography of Text”…

“In Fall 2014 I collaborated with the US Army to create the first large-scale map of the geography of academic literature and the open web, geocoding more than 21 billion words of academic literature spanning the entire contents of JSTOR, DTIC, CORE, CiteSeerX, and the Internet Archive’s 1.6 billion PDFs relating to Africa and the Middle East, as well as a second project creating the first large-scale map of human rights reports. A key focus of this project was the ability to infuse geographic search into academic literature…”

We probably need a name for such activities, and also for mining eco/geo data out of old paintings and photographs of landscapes. Geo-mining is too 20th century and eco-unfriendly. Geo-gleaning and Geo-gleaner are terms that have a certain poetry about them, while also suggesting both the curatorial and the imprecise nature of the techniques.

Google Scholar and grey literature

28 Monday Sep 2015

Posted by David Haden in Academic search, JURN's Google watch, Spotted in the news


Interesting new paper at PLOS One, “The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching”.

Test searches were drawn from review papers…

“…chosen as they covered a diverse range of topics in environmental management and conservation, and included interdisciplinary elements relevant to public health, social sciences and molecular biology.”

… and compared alongside Web of Science results…

“Surprisingly, we found relatively little overlap between Google Scholar and Web of Science (10–67% of WoS results were returned using searches in Google Scholar using title searches).”

Unsurprisingly, Google Scholar wasn’t found to be the one-stop shop many assume it to be…

“… some important evidence was not identified at all by Google Scholar … [so it] should not be used as a standalone resource in evidence-gathering exercises such as systematic [literature] reviews.”

Interesting finding also that…

“‘Peak’ grey literature content (i.e. the point at which the volume of grey literature per page of search results was at its highest and where the bulk of grey literature is found) occurred [in Google Scholar] on average at page 80 (±15 (SD)) for full text results … page 35 (±25 (SD)) for title [search] results.”
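To turn those page figures into result positions, here is a rough sketch in Python. It assumes 10 results per page, which is my assumption and not a figure stated in the paper, and the function name is just an illustrative one.

```python
def page_range_to_results(mean_page, sd_pages, per_page=10):
    """Convert a 'peak page +/- SD' figure into first and last result positions.

    per_page=10 is an assumed page size, not a value taken from the paper.
    """
    first_page = max(1, mean_page - sd_pages)
    last_page = mean_page + sd_pages
    first_result = (first_page - 1) * per_page + 1
    last_result = last_page * per_page
    return first_result, last_result


# Full text searches: peak at page 80, SD 15
print(page_range_to_results(80, 15))  # (641, 950)

# Title searches: peak at page 35, SD 25
print(page_range_to_results(35, 25))  # (91, 600)
```

On that assumption, one standard deviation either side of the full text peak runs from roughly result 641 to result 950.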

So this suggests that one might usefully flick through to result 700 (of 1000) and work a few hundred results starting from there, if seeking grey literature with a very well-formed topic search? By well-formed I mean the sort of sophisticated literature-review style of search term chaining being used in this study, for example…

"oil palm" AND tropic* AND (diversity OR richness OR abundance OR similarity OR composition OR community OR deforestation OR "land use change" OR fragmentation OR "habitat loss" OR connectivity OR "functional diversity" OR ecosystem OR displacement)
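A chained search string of that kind is tedious to type by hand. A minimal sketch of assembling one in Python follows; the helper names are my own, and it simply reproduces the AND / OR pattern of the example above and is not anything the study's authors published.

```python
def quote_term(term):
    """Wrap multi-word terms in straight double quotes; leave single words alone."""
    return f'"{term}"' if " " in term else term


def build_review_query(required, any_of):
    """AND together the required terms, then AND on a bracketed OR-group."""
    parts = [quote_term(t) for t in required]
    or_group = " OR ".join(quote_term(t) for t in any_of)
    parts.append(f"({or_group})")
    return " AND ".join(parts)


query = build_review_query(
    ["oil palm", "tropic*"],
    [
        "diversity", "richness", "abundance", "similarity", "composition",
        "community", "deforestation", "land use change", "fragmentation",
        "habitat loss", "connectivity", "functional diversity", "ecosystem",
        "displacement",
    ],
)
print(query)
```

Keeping the terms in plain lists makes it easy to swap topics while holding the structure of the search constant.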

It appears that the researchers only auto-extracted “citation records” from the search results, and then classified them into broad categories based on those alone. There appears to have been no checking of the validity of the links, and no downloading and scrutiny of PDFs. So there are no measurements of how many of Google Scholar’s links work or lead to free no-paywall fulltext articles.

Lastly, I noted…

“Google Scholar has a low threshold for repetitive activity that triggers an automated block to a user’s IP address (in our experience the export of approximately 180 citations or 180 individual searches). Thankfully this can be readily circumvented with the use of IP-mirroring software such as Hola (https://hola.org/)”

Has it leaked?

25 Friday Sep 2015

Posted by David Haden in My general observations, Spotted in the news


Has it leaked? is a rather nice specialist search tool for free content, from Sweden. Focussed on forthcoming arty music albums, it basically saves fans the task of tracking down the tracks / snippets / “making of…” etc that the official marketeers ‘leak’ for free in advance of the album, or during the release window. It’s not a pirate site, though, and firmly states: “No download links are allowed!”.


I’d say there’s room in the market for something similar for all quality non-fiction books, perhaps in partnership with a book-summary service like Blinkist, and with user-configurable topic filters.

Why would such a site be needed? Here’s an instance of the limited way in which current mega-services group versions or offer preview options. If one looks at Amazon UK for the new Matt Ridley book The Evolution of Everything: How New Ideas Emerge, one sees only two options there for the audiobook: free with an Audible direct-debit subscription, or a £30 pre-order and a wait until November for delivery. Even then the audiobook pages are not linked from the print book page, so someone landing on the print page via Web search would have no clue there even was an audiobook version. There is no mention at all on Amazon UK that it’s actually available now for £13 on the Audible UK site, or that there’s a free 13-minute extract of the audiobook’s introduction available via the publisher on SoundCloud. Only my deep searching surfaced the free audiobook extract.

The above suggests that two mega-services (Amazon and Audible) and a mega-publisher (Harper) can’t even co-ordinate promo material and version offers for a major book in the globally important UK market. So I’d say there’s a lot of scope for savvy curators to do it for them, also adding author podcast links, newspaper book review links etc.

DuckDuckGo testing #2

24 Thursday Sep 2015

Posted by David Haden in JURN tips and tricks


I did a quick experiment in making a Custom Search Engine via DuckDuckGo’s link-chaining feature. In this experiment I enabled a search across a small group of reputable crowdfunding services, via this search in DuckDuckGo. The search format is…

"open access" site:patreon.com,gofundme.com,peerbackers.com,mysherpas.com,wedidthis.org.uk,crowdcube.com,cofundos.org,indiegogo.com,rockethub.com,kickstarter.com

Works fine. WordPress.com refuses to embed an active link that contains “a phrase” (it’s the inverted commas, presumably), but this test link should work.
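If the inverted commas are what trips up the link embedding, percent-encoding the query is one way around it. Here is a small sketch using Python's standard library. The function names are mine, the `https://duckduckgo.com/?q=` address form is the ordinary DuckDuckGo search URL, and whether the comma-chained site: list keeps working is down to DuckDuckGo and not something this code can guarantee.

```python
from urllib.parse import urlencode


def chained_site_query(phrase, domains):
    """Build a quoted phrase followed by a comma-chained site: list."""
    return f'"{phrase}" site:{",".join(domains)}'


def duckduckgo_url(query):
    """Percent-encode the query so the quotes become %22 in the link."""
    return "https://duckduckgo.com/?" + urlencode({"q": query})


query = chained_site_query("open access", ["patreon.com", "gofundme.com"])
print(query)
print(duckduckgo_url(query))
```

The full list of crowdfunding domains from the search above can be passed in the same way.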

Unfortunately chaining a list of URLs appears to turn off DuckDuckGo’s intitle: search modifier, at least when searching for a phrase. But intitle: does work when using a single keyword, in a search such as…

intitle:journal "open access" site:patreon.com,gofundme.com,peerbackers.com,mysherpas.com,wedidthis.org.uk,crowdcube.com,cofundos.org,indiegogo.com,rockethub.com,kickstarter.com

A keyword / phrase that veers more into popular culture (such as Lovecraft) seems to cause Kickstarter results to swamp the search results.

I also noted that the search results from the above example fail to distinguish between “open access” and “open-access”. Adding +, as in +"open access", fails to force a verbatim search. There is obviously some slight wiggle-room in DuckDuckGo’s claim that they don’t try to second-guess your search terms. Google has the same problem, with a verbatim mode that is not really verbatim.
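Where a truly verbatim match matters, one workaround is to filter the results yourself after the search. A minimal sketch in Python follows, assuming you already have the result titles or snippets as plain strings; the sample titles are invented for illustration and the function name is my own.

```python
import re


def verbatim_filter(texts, phrase):
    """Keep only texts containing the exact phrase (case-insensitive).

    The lookarounds stop the phrase matching inside a longer word.
    """
    pattern = re.compile(
        r"(?<!\w)" + re.escape(phrase) + r"(?!\w)", re.IGNORECASE
    )
    return [t for t in texts if pattern.search(t)]


# Hypothetical result titles, made up for this example
titles = [
    "Crowdfunding an open access journal",
    "An open-access monograph series",
    "Open Access week fundraiser",
]
print(verbatim_filter(titles, "open access"))
```

The hyphenated “open-access” title is dropped, while the two titles with the spaced form are kept.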

There’s no sort-by-date filter on the search results, and adding the search modifier sort:date to the search causes a chained-URLs search to totally fail.

Sadly a list of chained URLs just doesn’t work with DuckDuckGo’s Image Search. For instance, a searcher can’t constrain Image Search thus…

"cute cat" site:flickr.com,deviantart.com,commons.wikimedia.org

When looking for Creative Commons images using DuckDuckGo Image Search a better strategy is probably simply to dispense with the URL chain and use this…

"cute cat" "some rights reserved" OR "cute cat" commons attribution -noncommercial

This will still pick up “noncommercial” CC pictures on Flickr (since Flickr obfuscates the picture’s license behind a “some rights reserved” generality), but at least you’d be headed in the right direction. Note that it seems that DuckDuckGo only lets you use a single minus sign to knock out one keyword from the search, and it has to be at the end of the search to work.
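For anyone running this sort of image search often, the pattern can be wrapped in a small helper. This is only a sketch of the query format given above, with a function name of my own; it keeps the single minus term at the very end, in line with the behaviour I observed.

```python
def cc_image_query(subject, exclude="noncommercial"):
    """Build the two-branch Creative Commons approximation for an image search.

    The one excluded keyword is placed last, as the search seemed to require.
    """
    phrase = f'"{subject}"'
    return (
        f'{phrase} "some rights reserved" OR '
        f"{phrase} commons attribution -{exclude}"
    )


print(cc_image_query("cute cat"))
```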

A “Region” filter doesn’t appear to work on Image Search. You can’t just see the “cute cats” of Japan, for instance.


DuckDuckGo testing #1

23 Wednesday Sep 2015

Posted by David Haden in JURN tips and tricks


First finding from my DuckDuckGo search testing: site: is not at all a reliable indicator of what is indexed, when used with an extended URL. For instance, take the PDFs of the Joint Nature Conservation Committee, UK…

site:http://jncc.defra.gov.uk/pdf/

One lone result in DuckDuckGo. However, search for…

"The Vascular Plant Red Data List for Great Britain"

And up it pops at…

http://jncc.defra.gov.uk/pdf/pub05_speciesstatusvpredlist3_web.pdf

So the PDFs at http://jncc.defra.gov.uk/pdf/ are in there then, but it seems they can only be surfaced in DuckDuckGo by using…

site:jncc.defra.gov.uk filetype:pdf
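The rewrite from an extended URL to the host-plus-filetype form is mechanical, so it can be scripted. A minimal sketch with Python's standard library, under the assumption that the directory path is simply dropped and only the host is kept (the function name is mine):

```python
from urllib.parse import urlparse


def site_filetype_query(url, filetype="pdf"):
    """Reduce a URL with a path to a 'site:host filetype:ext' search string."""
    # urlparse only fills in netloc when a scheme or a leading // is present
    parsed = urlparse(url if "://" in url else "//" + url)
    return f"site:{parsed.netloc} filetype:{filetype}"


print(site_filetype_query("http://jncc.defra.gov.uk/pdf/"))
# site:jncc.defra.gov.uk filetype:pdf
```

Note that this widens the search to every PDF on the host, not just those under /pdf/.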

AdBlock Browser launches

23 Wednesday Sep 2015

Posted by David Haden in Spotted in the news


The Adblock Browser has launched for mobile devices (Android and iOS). DuckDuckGo is their default search-engine.

How to delete a fulltext PDF from ResearchGate

22 Tuesday Sep 2015

Posted by David Haden in JURN tips and tricks, Spotted in the news


This may be handy for some people: how to remove your fulltext PDF from ResearchGate, but leave the record standing. The way to the delete function isn’t very intuitive…

Historical ecology and art history

20 Sunday Sep 2015

Posted by David Haden in Spotted in the news


A fine short blog post by Manu Saunders on the historical ecology data latent in art history.

Picture: 1857 bushfire near Timboon, Victoria, Australia.

Tree of Life

20 Sunday Sep 2015

Posted by David Haden in Spotted in the news


Tree of Life, a rough first-try at merging the available data on the relationships of the 2.3 million known and named species on Earth…

“According to a survey of more than 7,500 phylogenetic research papers published between 2000 and 2012, only one out of six studies came with a digital, downloadable format of the data. … Many of the evolutionary trees that have been published are only available as PDFs and other image files that can’t be entered into a database or merged with other trees.”

DuckDuckGo Image Search

20 Sunday Sep 2015

Posted by David Haden in JURN tips and tricks


DuckDuckGo’s Image Search is now a very pleasant experience in terms of relevancy ranking, a year after introduction of the images service in the summer of 2014. Speedy, too.

Of course it lacks Google’s useful filters for Creative Commons and image-size, but CC can be approximated in DuckDuckGo by adding the keywords Commons and Attribution to one’s search, and DuckDuckGo doesn’t seem to distort such a search by also trying to find synonyms.

Nor does adding the word commons mean that it gets confused into searching for pictures of the ‘heather and hawks’ type of natural heathland commons.

Such an approximation of CC appears to work quite well. And in such cases (‘find large-size CC images’) DuckDuckGo doesn’t appear to have a major handicap compared to Google, since both search engines seem to cover the same mega-services such as Wikipedia and DeviantArt etc.

Flickr is a special case, since the relevant keywords aren’t there — one would do better to use search.creativecommons.org for a thorough CC Flickr search.




