I’ve expanded my guide on how to build and install your own self-hosted linked Google CSE, and published it as a new book…
A four-part survey on current semantic enterprise startups.
Google Books reveals the number of books catalogued by their service: 129,864,860, by their count. There’s some interesting information about how they de-duplicated and cleaned the data.
But how many have more than a fragmentary (and usually misplaced) “snippet view”? As of 2008, Google and its partners had scanned seven million books, and made one million available in “full preview”. Another one million were then also available in “full preview”, because they were public domain. Even assuming that we now have access to 3 million books that we can “preview” on Google Books, there’s still a long way to go to get the other 126,864,860 books online, although it might be an easier task once we remove the non-English works and fiction from that total.
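For anyone who wants to check the sums, here is a minimal sketch of the arithmetic using only the figures quoted above. The 3 million "previewable" figure is the generous assumption made in this post, not a number published by Google, and the variable names are purely illustrative.

```python
# Figures quoted in the post above.
TOTAL_BOOKS = 129_864_860        # Google's count of catalogued books
ASSUMED_PREVIEWABLE = 3_000_000  # assumption made in this post, not a Google figure

# Books that would still need to be made available online.
remaining = TOTAL_BOOKS - ASSUMED_PREVIEWABLE

# Share of the total that is previewable under that assumption.
share_previewable = ASSUMED_PREVIEWABLE / TOTAL_BOOKS

print(f"{remaining:,} books still to go")        # 126,864,860 books still to go
print(f"{share_previewable:.1%} previewable")    # 2.3% previewable
```

Even on that generous assumption, only a little over two percent of the total would be open to preview.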
A new book series in English from Brill of the Netherlands, Scholarly Communication: past, present and future of knowledge inscription.