Joseph Esposito has usefully had a peek inside a very expensive commercial market report titled Global Social Science & Humanities Publishing 2013-2014.

Social science and humanities publishing is found to be worth around $5bn, perhaps 25% of the size of Science/Technology/Medicine. That actually strikes me as something of an achievement, when you consider that we have far smaller research funding inputs and a smaller technical/training infrastructure to call on. But perhaps the $5bn figure is given a strong boost by teacher training textbooks, social work manuals and the like?

Joseph highlights the report’s finding of a highly fragmented market. This market fragmentation is one of the reasons I’m skeptical about the success of a ‘one metadata to rule them all’ solution to OA indexing and discovery. It seems that no more than just over 20% of DOAJ-listed OA journal titles find their way, in full text, into even the largest of the commercial databases (such as EBSCO Complete). When last heard of, Web of Science / Scopus seemed to be barely scraping 1,000 OA titles indexed. One art history study found that Google Scholar could index only half the DOAJ’s OA art history titles. A dastardly conspiracy to keep OA titles out of these big indexes seems unlikely. So I suspect it’s largely due to many OA editors in the arts and humanities not giving a fig about providing the means to automatically index their content. Their widespread lack of something as basic as RSS feeds seems to confirm that. Add to that the fact that only 56% of DOAJ journals can supply the DOAJ with article metadata.

Persuading non-librarian types to do something as simple as tagging all their back-issue content with a new machine-readable OA tag thus seems rather a long shot. Persuading mainstream publishers to do the same? Well… maybe, but what’s their incentive for that? Even if they do, will they allow mass harvesting of the OA articles? Nor are librarians likely to be of much use after the fact of publication, since they seem to have mostly failed to apply even their own metadata standards to open content, and open repository metadata quality is reported to be dire.