Google Scholar banned in China

Paul Stapleton, associate professor at the Hong Kong Institute of Education, writes in the South China Morning Post today that… “China must unblock Google Scholar”

“… it is curious to note what has happened recently on the mainland [China]. Google Scholar is no longer available there.”

It’s a weak article but at least it’s made me aware that Scholar, as well as the main Google Search, had been blocked in mainland China. Looking back through the surprisingly sparse western news reports, I see that the Chinese national block reportedly began in earnest on 29th May 2014. The New York Times reported in September 2014 “China Clamps Down on Web, Pinching Companies Like Google”

… blocking virtually all access to Google websites [from 29th May 2014 onwards, and … ] the block has largely remained in place ever since. […] Jin Hetian, an archaeologist in Beijing […] said. “When in China, I’m almost never able to access Google Scholar, so I’m left badly informed of the latest findings.”

Back in January 2015 The New York Times reported “China Further Tightens Grip on the Internet”

“In recent weeks, a number of Chinese academics have gone online to express their frustrations, particularly over their inability to reach Google Scholar, a search engine that provides links to millions of scholarly papers from around the world. [there is now an energy-sapping] unending scramble to find ways around website blockages…”

An April 2015 Forbes article “How The Great Firewall Prevents China From Becoming A World Education Power” failed to mention Scholar, but the journalist (visiting Shanghai at the time) opened by reminding readers that…

“all things Google are all blocked [in China]”

The reason for the ban appears to be ideological. The respected Index on Censorship had an article “Return of the Red Guards: the risks faced by students and teachers criticising the government line in China” in their June 2015 issue, that opened…

Since Xi Jinping came to power nearly three years ago, China has witnessed an intense campaign against anyone who criticises the party. Recently this campaign has moved into universities and sought to muffle both teachers and students alike. […] In January 2015, the Chinese leadership released guidelines that said universities must prioritise ideological loyalty to the party, the teaching of Marxism and Xi Jinping’s ideas. In the days following this announcement, education minister Yuan Guiren announced to a room of leaders from several prominent universities that the use of Western textbooks would be restricted and any that promote “Western values” would be banned. […] “By no means allow teaching materials that disseminate Western values in our classrooms,” Yuan told the gathering. “Never allow statements that attack and slander party leaders and malign socialism to be heard in classrooms.”

Presumably Google Books is also blocked in China.

Flip the classroom: a survey of some magazines on Issuu

This is my one-off selective survey of some journals and substantial magazines available via Issuu as free flipbooks, at November 2015:

Archive : the journal of the Leslie-Lohman Museum of Gay and Lesbian Art


AR[t] : augmented reality, art and technology


B/AS : journal of dress practice


Berkeley Review of Latin American Studies


Bonefolder : an ejournal for the book binder and book artist (‘best of’ compendium as an ebook)


British Journal of Photography (substantial but partial previews)


Catalan Historical Review and many other Catalan journals from


Cornell Journal of Architecture and Cornell AAP


Eye Magazine and other IAFOR titles


Explorations : The Texas A&M Undergraduate Journal


Fire Ecology


Graduate Journal of Food Studies


Horizonte : journal of architectural discourse


Humanities (National Endowment for the Humanities magazine)


Illumination : the Undergraduate Journal of Humanities


International Journal of Wilderness


Jewish Museum Berlin journal


Journal of Interdisciplinary Studies in Sexuality


Kinfolks Quarterly : a journal of black expression


Medical Humanities Journal of Boston College


On Site


Perspective (the highest end of the movie industry, journal of the Art Directors’ Guild) (also has an easier PDF index)


Planum : the Journal of Urbanism


UCSC Jewish Studies Newsletter


WWB and related fashion industry magazines.


Issuu at 25 million

Once upon a time, a creative seeking contemporary visual inspiration might trawl a university library’s new journal shelves. Now there are 25 million magazines online at Issuu, free-to-read and in handy flip-book format. The art / design / fashion section of Issuu is a fascinating insight into what editors have cared enough about to produce a properly designed and curated magazine for. Issuu is especially good for fashion students, with middleweight industry magazines such as WWB and MWB publishing on Issuu…


The highly curated nature of many Issuu magazines means that spotting subtle forward cultural trends may be easier than amid the manic/nostalgist jumble of Pinterest.

Issuu’s native search box isn’t great, at least for the sort of one-word searches its users are likely to use. “academic” plunges one into a wasteland of old university course catalogs and redundant student guides. “Scholarly” is only marginally better.

Sadly the flipbook format, of medium res jpgs + javascript, is usually unfriendly to Google Search. Unless you’re the editor of the National Endowment for the Humanities’ Humanities magazine or the Medical Humanities Journal of Boston College, both of whom obviously know how to extract the fulltext and invisibly add it to each issue’s flipbook page.



Many other publishers, such as Texas Wildlife magazine, don’t seem to know how to get their magazine’s text indexed by Google. Even if they did, one suspects that it may not be the greatest search experience — searching against a whole run-on block of text, that’s been auto-extracted from a PDF.


What about getting a date-ordered single page of a title’s issues? Publishers on Issuu can at least, if they wish, produce ‘stacks’ from their magazine issues. Japan’s International Academic Forum (IAFOR), for instance, does this. But their sub-stack for IAFOR journals is rather lumpen and fails to present ordered runs of their titles. Luckily, in their case, the issues are also online in PDF.

The similar jumbling of dates at Issuu for the Illumination: the Undergraduate Journal of Humanities stack suggests that re-ordering titles by publication date, rather than by upload date, may not actually be possible. Just my guess.

Regrettably Issuu doesn’t let Google index */stacks/. Or perhaps the Googlebot just didn’t feel the need to do so?

Occasionally one gets only a truncated preview at Issuu, such as the venerable British Journal of Photography or the previews of the V&A Museum books, but most titles appear to be complete. Sometimes a little too complete, hem hem. Issuu might add a useful icon or two: indicating ‘sample only’ and ‘flagged by users as possibly pirated’.


Issuu’s somewhat mainstream tilt nicely complements, a similar well-established free magazine flipbook site that veers more towards the indie fashion / artzine / perzine end of the spectrum. The print-focussed MagCloud store, until recently owned by HP, also has a modest ‘free to read online’ section


Creative students and faculty will still benefit from grabbing an armful of current print magazines from the library shelves, especially the elite photography, architecture/design, fashion and fine art publications. But thanks to Issuu and their ilk, the rest of us can now have a similar experience, albeit minus the likes of Vogue, Aperture and similar.

Google New

A long article in TIME magazine this week on Google’s roadmap for voice recognition / voice-controlled services on everyday platforms — such as phones, wrist-bands, smartcars, and perhaps even dolls and fridges. Robot cats, even. Sadly, TIME remarks that…

“At the product meetings where Google plans out the future of its search products, the desktop is rarely discussed.”

Let’s hope that’s because their ‘new product’ marketeers are confident that desktop keyword search is still being steadily advanced, and by the world’s best techies, somewhere far below them in the Google-bunkers.

Source IP

Source IP is a new single hub from which to search across Australia’s current commercial patents. It’s different from the usual dull type of patent search. SIP search results make it easy to identify patents available for sale, and also what’s out there but is already owned (by potential competitors). Each search result goes to a business-friendly page with a picture, the patent abstract, expiry dates, and inventor contact details.


Use MS Excel 2007 to split a long column / list into smaller chunks

How to use MS Excel 2007 to split a long column or list into smaller chunks, for later batch processing:

Real world scenarios: You have a simple but huge list that you want to parcel/email out in equal portions to various project participants. Or you are working with an old form-based system that can only process X amount of items at a time.

1. Get the excellent free ASAP Utilities plugin, install it in Excel.

2. Open a new sheet and paste your long list down into a single column.

3. In your new ASAP Utilities tab, click the Select button.

4. ASAP’s Select gives you a list of choices before it runs. Choose option 2 (“Conditional Row and Column Select…”) and then use the dialog box that appears. Here I’ve opted to have ASAP tell Excel to select every 25th cell…


5. Run Select, then exit the dialog box. The cells won’t immediately look like they’ve been selected. But if you Ctrl + C to copy them, then the familiar “marching ants” will reassuringly appear around the selected cells.

6. Now right-click your mouse anywhere inside your new group of selected cells, and choose ASAP Utilities | option 18 “Insert before and/or after each cell in your selection…” In this new dialog choose “Insert after” and type {lf} to add a new blank line inside each of your selected cells.

7. Run the Insert process. It may take a minute to run, on a long list. Each selected cell will be given a double height by adding a line-break, thus…


If you just need to print out an Excel spreadsheet with each list-chunk separated by a space, perhaps so that your manager can easily read through the list in printed form, then you can leave the process there.

8. Some may now want to go further. When the whole column is selected and copied out to Notepad, you will see that the 25th, 50th, 75th etc cell will appear in quote marks “”, thus…

item 24
“item 25”
item 26

That’s kind of useful, but not really — since the primitive Notepad can’t handle multi-line search/replace.

However, simply paste the same list into the free open-source Notepad++ and the list copies as…

item 24
“item 25

item 26

9. That’s perfect. So now we just use Notepad++ to search for all the occurrences of the quote mark and replace them with nothing. Then we have our list in chunks of 25, each nicely separated by a blank line.

10. The neatly chunked list can now be pasted back into Excel, adding real blank cells between each chunked section. You might then add a comma to each blank cell, thus giving a basic comma-delimited .csv file for use with automated mailing-list software and similar.

Or the list can simply be saved out of Notepad++ as a plain .txt list, to work with manually — in clearly defined batches of 25 at a time.
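For anyone who would rather skip the Excel and Notepad++ round trip, the same chunking can be sketched in a few lines of Python. This is only a rough equivalent and not part of the ASAP Utilities workflow above. It assumes the list is already held as one item per entry, and the function names `chunk` and `chunked_text` are my own, chosen for illustration.

```python
from typing import Iterator, List


def chunk(items: List[str], size: int = 25) -> Iterator[List[str]]:
    """Yield successive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def chunked_text(items: List[str], size: int = 25) -> str:
    """Join the chunks into one string, with a blank line between chunks."""
    return "\n\n".join("\n".join(group) for group in chunk(items, size))


if __name__ == "__main__":
    # Hypothetical sample data standing in for the long pasted column.
    sample = [f"item {n}" for n in range(1, 61)]
    print(chunked_text(sample))
```

The output is the same shape as the Notepad++ result: batches of 25 lines, each separated by a blank line, with any remainder in a shorter final batch.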

The advantage

Post your articles to as soon as they’re published, get more citations….

“Based on a sample size of 34,940 papers, we find that a paper in a median impact factor journal uploaded to receives 41% more citations after one year than a similar article not available online, 50% more citations after three years, and 73% after five years. We also found that articles also posted to had 64% more citations than articles only posted to other online venues, such as personal and departmental homepages, after five years.” [the conclusion expands this “other” element, it includes: “journal site, or any other online hosting venue”]

The studied papers were uploaded at “the same time they’re published”. Excluded from the study were… “articles uploaded to after they were published”.

Amazingly, the authors also note that…

“To our knowledge there has been no research on what features of open access repositories or databases make articles easier to discover”

All that public money spent on repositories around the world, and not one librarian has felt the need to test for such public discoverability vectors? Seriously?


A new Canadian commercial start-up is offering its new oaFindr service, with free / low-cost trials for university libraries. oaFindr is said to be able to explore a library’s existing journal subscriptions, and to identify just the open access articles within the hybrid journals. According to the press release oaFindr…

“… enable[s] academic institutions to analyze their journal subscriptions and provide[s] them with a reliable, precise search and discovery tool to retrieve all open access articles. This solution will also help them comply with governmental open access mandates, and support them in rapidly increasing the diffusion of their institutions’ scholarly production in a manner that is much less labour-intensive”

The idea appears to be that the discovered OA articles are then harvested and passed to the company’s related oaFoldr service, with oaFoldr providing a conduit into their hosted repository for the OA articles. Nice if it works and gets adopted and, if public, it would provide a welcome new mega-repository for Google and JURN to index. Alternatively, I suppose that the oaFoldr may just be a private folder for cataloguers, in which the articles reside before being placed into the university’s own repository. More likely to be the latter, since otherwise one commercial company could potentially get to corral the world’s OA article output in its own repository, and would then be in a position to sell it back to universities via an enhanced search and mining/metrics service.

Regrettably, as Bernard Rentier observes, mass extraction and archiving of 1000s of OA articles per month from commercial databases may not be welcomed by the big publishers…

“Elsevier has designed a way to prevent researchers from mass-downloading articles from its website where they are so-called open access…”

So how would universities harvest efficiently? Bear in mind that commercial licenses may also prevent a university from taking the proprietary hybrid journal metadata from the likes of Elsevier, Springer, Oxford etc, along with their OA fulltext PDFs. So I guess it’s much more likely that each institution will play safe and harvest only PDF articles by their own researchers, thus giving a much lower harvesting volume that might not trigger download blocking. And that they’ll find ways not to take any metadata generated around the OA article by publisher databases.

I wonder if some large institutions may have to harvest articles via spoofing multiple ‘student’ accounts? Or is oaFindr itself pre-harvesting OA PDFs from hybrid journals and then vending them to institutions along with metadata? Probably not, or the big publishers would likely be throwing lawsuits at the company. oaFindr seems more likely to be a sort of super-Paperity, but covering all hybrid titles from the big publishers plus all the DOAJ titles at the article level. I’m guessing a lot here, of course, but if such a service works then it would be rather cool. Though probably lacking in things like Google-strength semantics and relevance ranking.

So let’s assume that the university libraries are the ones that do the work of harvesting OA PDFs for their repositories. OA mandates and the consequent exponential growth of OA articles may still lead to the hitting of a ‘mass downloading’ roadblock in the near future, even at a university which restricts itself to its own outputs and/or harvests fulltext via multiple accounts. Big publishers might even change their database small-print, so as to forbid ‘type targeted’ mass harvesting leading to local storage of articles.

I guess one solution would then be to rely only on having repository records + Web links to the fulltext (fulltext hosted back on the journal’s website). Though that assumes that links don’t break. Which they do, and at a horrendous rate.

In the end I suspect it may just be easier for a university to go after its research staff with pitch-forks, and literally force them to upload their OA papers to the university repository. If your new paper isn’t in the repository after 28 days, then your next month’s salary gets docked 20% and your department can’t apply for any new funding or external partnerships in the next six months. That sort of thing.

