Use MS Excel 2007 to split a long column / list into smaller chunks

How to use MS Excel 2007 to split a long column or list into smaller chunks, for later batch processing:

Real-world scenarios: you have a simple but huge list that you want to parcel out or email in equal portions to various project participants. Or you are working with an old form-based system that can only process X items at a time.

1. Get the excellent free ASAP Utilities plugin, install it in Excel.

2. Open a new sheet and paste your long list down into a single column.

3. In your new ASAP Utilities tab, click the Select button.

4. ASAP’s Select gives you a list of choices before it runs. Choose option 2 (“Conditional Row and Column Select…”) and then use the dialog box that appears. Here I’ve opted to have ASAP tell Excel to select every 25th cell…


5. Run Select, then exit the dialog box. The cells won’t immediately look like they’ve been selected. But if you press Ctrl + C to copy them, the familiar “marching ants” will reassuringly appear around the selected cells.

6. Now right-click anywhere inside your new group of selected cells, and choose ASAP Utilities | option 18 “Insert before and/or after each cell in your selection…” In this new dialog choose “Insert after” and type {lf} to add a line-break inside each of your selected cells.

7. Run the Insert process. It may take a minute on a long list. Each selected cell becomes double height, thanks to the added line-break, thus…


If you just need to print out an Excel spreadsheet with each list-chunk separated by a space, perhaps so that your manager can easily read through the list in printed form, then you can leave the process there.

8. Some may now want to go further. When the whole column is selected and copied out to Notepad, you will see that the 25th, 50th, 75th etc. cells appear wrapped in quote marks “”, thus…

item 24
“item 25”
item 26

That’s kind of useful, but not really, since the primitive Notepad can’t handle multi-line search-and-replace.

However, simply paste the same list into the free open-source Notepad++ and the list copies as…

item 24
“item 25

item 26

9. That’s perfect. So now we just use Notepad++ to find all occurrences of the stray quote marks and replace them with nothing. Then we have our list in chunks of 25, each nicely separated by a blank line.

10. The neatly chunked list can now be pasted back into Excel, adding real blank cells between each chunked section. You might then add a comma to each blank cell, thus giving a basic comma-delimited .csv file for use with automated mailing-list software and similar.

Or the list can simply be saved out of Notepad++ as a plain .txt list, to work with manually — in clearly defined batches of 25 at a time.
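For anyone comfortable with a little scripting, the same chunking can be done without Excel or Notepad++ at all. The sketch below is a minimal Python equivalent of steps 2–9 above; the function name is my own invention, and the batch size of 3 in the demo is just for brevity (the tutorial uses 25).

```python
def chunk_list(items, size=25):
    """Join items one per line, inserting a blank separator line
    after every `size` items — mirroring the {lf} trick above."""
    lines = []
    for i, item in enumerate(items, start=1):
        lines.append(item)
        if i % size == 0 and i < len(items):
            lines.append("")  # blank separator line between chunks
    return "\n".join(lines)

# Demo: batches of 3 instead of 25, for brevity
print(chunk_list([f"item {n}" for n in range(1, 8)], size=3))
```

The resulting text can then be saved as a plain .txt file, exactly as in the Notepad++ route, or the blank lines swapped for commas if a delimited file is what’s wanted.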

The advantage

Post your articles to as soon as they’re published, get more citations….

“Based on a sample size of 34,940 papers, we find that a paper in a median impact factor journal uploaded to receives 41% more citations after one year than a similar article not available online, 50% more citations after three years, and 73% after five years. We also found that articles also posted to had 64% more citations than articles only posted to other online venues, such as personal and departmental homepages, after five years.” [the conclusion expands this “other” element, it includes: “journal site, or any other online hosting venue”]

The studied papers were uploaded at “the same time they’re published”. Excluded from the study were… “articles uploaded to after they were published”.

Amazingly, the authors also note that…

“To our knowledge there has been no research on what features of open access repositories or databases make articles easier to discover”

All that public money spent on repositories around the world, and not one librarian has felt the need to test for the public discoverability vectors? Seriously?


A new Canadian commercial start-up is offering its new oaFindr service, with free / low-cost trials for university libraries. oaFindr is said to be able to explore a library’s existing journal subscriptions, and to identify just the open access articles within the hybrid journals. According to the press release oaFindr…

“… enable[s] academic institutions to analyze their journal subscriptions and provide[s] them with a reliable, precise search and discovery tool to retrieve all open access articles. This solution will also help them comply with governmental open access mandates, and support them in rapidly increasing the diffusion of their institutions’ scholarly production in a manner that is much less labour-intensive”

The idea appears to be that the discovered OA articles are then harvested and passed to the company’s related oaFoldr service, with oaFoldr providing a conduit into their hosted repository for the OA articles. Nice if it works and gets adopted and, if public, it would provide a welcome new mega-repository for Google and JURN to index. Alternatively, I suppose that the oaFoldr may just be a private folder for cataloguers, in which the articles reside before being placed into the university’s own repository. More likely to be the latter, since otherwise one commercial company could potentially get to corral the world’s OA article output in its own repository, and would then be in a position to sell it back to universities via an enhanced search and mining/metrics service.

Regrettably, as Bernard Rentier observes, mass extraction and archiving of 1000s of OA articles per month from commercial databases may not be welcomed by the big publishers…

“Elsevier has designed a way to prevent researchers from mass-downloading articles from its website where they are so-called open access…”

So how would universities harvest efficiently? Bear in mind that commercial licenses may also prevent a university from taking the proprietary hybrid journal metadata from the likes of Elsevier, Springer, Oxford etc, along with their OA fulltext PDFs. So I guess it’s much more likely that each institution will play safe and harvest only PDF articles by their own researchers, thus giving a much lower harvesting volume that might not trigger download blocking. And that they’ll find ways not to take any metadata generated around the OA article by publisher databases.

I wonder if some large institutions may have to harvest articles via spoofing multiple ‘student’ accounts? Or is oaFindr itself pre-harvesting OA PDFs from hybrid journals and then vending them to institutions along with metadata? Probably not, or the big publishers would likely be throwing lawsuits at the company. oaFindr seems more likely to be a sort of super-Paperity, but covering all hybrid titles from the big publishers plus all the DOAJ titles at the article level. I’m guessing a lot here, of course, but if such a service works then it would be rather cool. Though probably lacking in things like Google-strength semantics and relevance ranking.

So let’s assume that the university libraries are the ones that do the work of harvesting OA PDFs for their repositories. OA mandates and the consequent exponential growth of OA articles may still lead to hitting a ‘mass downloading’ roadblock in the near future, even at a university which restricts itself to its own outputs and/or harvests fulltext via multiple accounts. Big publishers might even change their database small-print, so as to forbid ‘type targeted’ mass harvesting leading to local storage of articles.

I guess one solution would then be to rely only on having repository records + Web links to the fulltext (fulltext hosted back on the journal’s website). Though that assumes that links don’t break. Which they do, and at a horrendous rate.

In the end I suspect it may just be easier for a university to go after its research staff with pitch-forks, and literally force them to upload their OA papers to the university repository. If your new paper isn’t in the repository after 28 days, then your next month’s salary gets docked 20% and you can’t apply for any new funding or external partnerships in the next six months. That sort of thing.

Walk the British Museum in Google Street View

Oh, how wonderful. Now you can walk the floors of the British Museum, via Google Street View, and get close-ups of 4,500 artefacts. No more trudging for miles through hordes of tourists, with nowhere to sit down except in the cafes…

“Built over 15 months with the help of a Google employee with a camera on wheels [and] completed by the Google Cultural Institute after hours, with special light-bulbs being installed to ensure the lighting remained the same through the galleries. The results can now be used by members of the public, academics who wish to study objects in detail from home, or teachers, who are being encouraged to “bring their lessons to life” through the resources.”

Also very useful for visitors who are only ever going to get one pass at an in-person visit, and who want to learn the layout of the place first in order to maximise their time at the Museum.




Added to JURN

Journal of Creative Music Systems

Hedgehog Review, The (archives)

Jesuit Higher Education : A Journal


Ontario Archaeology

Insights : the UKSG journal

Chestnut Grower, The

Bioscan, The (poorly presented, but it is in the DOAJ)


The Dictionary of Art Historians is now back in JURN, after a URL shift had knocked it out for about six months.

Thanks also to JournalTOCs for the recent crop of open access U.S. university dept. law journals.

Extreme right

The EU’s “right to be forgotten” ruling is now blocking access to historical Holocaust archives, reports the Jerusalem Post.

“Researchers across the continent – especially in Sweden, France and Germany – have claimed that archivists have begun restricting access to data, citing the GDPR as their rationale for not complying with requests for documents. Because the legislation does not stipulate how long after a person’s death his or her private information can be revealed, or when access to such information can be granted, some archivists “have begun reading into what they understand the law will be,” and are “barring access to materials, including materials [related to] the history of the Holocaust,” Dr. Robert Williams said.”

For the birds….

My openECO A-Z listing of journals has now had all known free bird titles added to it, with help from the open ejournals in the Ornithology Exchange list and the British Trust for Ornithology list of open ejournals. All the new bird journal URLs were closely checked before being added to the A-Z, since those older lists have a lot of linkrot and even some mis-attribution of OA status.

Some figures on OA in Web of Science

Thomson Reuters on OA in Web of Science

“The Web of Science [has] more than 12% of its core collection database in Open Access journals, many with direct, full-text links to open content (see Figure 1).”

Fig. 1: “Open Access Titles in the Web of Science by Discipline & Geography” (SOURCE: Thomson Reuters Web of Science)

The total of 72 OA titles in arts & humanities (all in English?) is comparable to Scopus. Scopus had 60 OA arts & humanities titles in English at June 2015, a fact discoverable via their new OA tagging. Though, after sorting, that Scopus category also included such très arty titles as Canadian Journal of Speech-Language Pathology, Journal of Biomedical Discovery and Collaboration and Asian Social Science.

