GUI DEiXTo 2.9.6 (Jan 2012), free Windows software for scraping records from older repositories that don’t use OAI-PMH. The English User Manual is here.
Monday, 23 Jan 2012
DEiXTo allows you to scrape data from websites of interest and export it in a wide variety of formats. But in order to produce an OAI-PMH feed you will need programming skills and custom code to transform the extracted metadata into the desired form. However, at deixto.com they can assist you with this challenging task, since they have already helped some significant Greek online (non-OAI-PMH) digital libraries/collections to get included in Europeana as well as openarchives.gr by scraping and repurposing their rich content.
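As a rough illustration of the "custom code" step mentioned above, here is a minimal Python sketch that turns one scraped record into an `oai_dc` metadata block, the Dublin Core format every OAI-PMH repository is expected to support. The field names and the sample record are made up for the example and are not DEiXTo output; a real feed would also need the surrounding OAI-PMH response envelope, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for unqualified Dublin Core inside OAI-PMH.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def record_to_oai_dc(record):
    """Serialize a dict of Dublin Core fields as an <oai_dc:dc> element.

    `record` maps Dublin Core element names (title, creator, ...) to
    string values; empty values are skipped.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field, value in record.items():
        if not value:
            continue
        child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical scraped record, for illustration only.
sample = {
    "title": "Sample item",
    "creator": "Unknown",
    "date": "1912",
    "identifier": "http://example.org/item/1",
}

xml_text = record_to_oai_dc(sample)
print(xml_text)
```

The sketch only covers the metadata payload; harvesting verbs such as `ListRecords` would have to be served by whatever script or platform hosts the feed.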
Thanks Kostas. Could one not just do this… 1) scrape to CSV in basic Dublin Core; 2) import CSV to Omeka; 3) expose Omeka records as an OAI-PMH repository? Then you would have an OAI-PMH feed.
Thank you David for bringing Omeka to my attention; I was not aware of it. Its CSV import plugin sounds very promising and overall Omeka seems a remarkable tool. As far as CSV is concerned, yes, you could extract the bits of interest with DEiXTo and save them in CSV format (either in a single file or in multiple files if necessary). So, once scraping is finished and you have the raw data at hand (in a format suitable for further processing), it's up to you how you transform it into OAI-PMH (through Omeka, via a customized script, or some other way). Therefore, if I am not missing anything, the combination of DEiXTo with Omeka could potentially help non-OAI-PMH digital collections to export and distribute their content in OAI-PMH format (with all the advantages this brings).
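For what it's worth, step 1 of the workflow discussed above (scraped data to a basic Dublin Core CSV) could look something like the Python sketch below. The column names, the choice of four fields, and the sample rows are assumptions for illustration; how the columns get mapped to Omeka elements would be decided in the CSV import step, which is not shown.

```python
import csv
import io

# Assumed subset of Dublin Core elements used as CSV columns.
DC_FIELDS = ["title", "creator", "date", "identifier"]


def records_to_csv(records):
    """Return CSV text with one row per record and Dublin Core headers.

    Missing fields are written as empty cells; keys that are not in
    DC_FIELDS are ignored.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DC_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow({field: record.get(field, "") for field in DC_FIELDS})
    return buffer.getvalue()


# Hypothetical scraped records, for illustration only.
scraped = [
    {"title": "First item", "creator": "A. Author", "date": "1901",
     "identifier": "http://example.org/item/1"},
    {"title": "Second item, with comma", "identifier": "http://example.org/item/2"},
]

csv_text = records_to_csv(scraped)
print(csv_text)
```

Using the `csv` module rather than joining strings by hand means that values containing commas or quotes are escaped properly, which matters for free-text fields such as titles.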