I would like to respond to a post, Ask Science Woman: How do I organize journal articles?, by Science Woman. I think this is a very important topic for all aspiring scholars. Science Woman’s advice is excellent. I have just a few suggestions to add.
I think my problem … the problem of organizing “offprints” (copies) of papers … may be an order of magnitude greater than that of many other scholars because of some opportunities I had while in graduate school. My first advisor, Glynn Isaac, had an incredible collection of African (mainly) archaeological (mainly) materials, and I was at a school that did not see a photocopy machine as a major resource to be protected by uniformed armed guards. So I photocopied his collection. Actually, there were a few of us involved in this … we simply divided up the work, made multiple copies of each paper, and redistributed them. We even took a few stacks to the copy center when we had extra cash that we could not spend on beer. This took about a year.
After Glynn died, my new advisor, Irv DeVore, was found to have a collection of about the same size with virtually no overlap. That one I pilfered on my own, because our photocopy club had by then dispersed and we were all in the field or elsewhere at different times. Simultaneously, I pilfered David Pilbeam’s photocopy collection, but more selectively (by that time, there was a substantial overlap between my own collection and David’s).
Having spent very little time in classwork for my undergraduate degree (four credits’ worth, to be exact), I took every course I could in graduate school … I found taking classes to be great fun … and every one of them required inspection, and sometimes even careful reading, of between fifty and a couple of hundred papers. The Borg known as my offprint collection absorbed them all. And I’ve continued this apace. My collection now numbers about 10,000 different articles stored in about 16 file cabinet drawers.
While pilfering various collections, and working with various senior colleagues, I was able to see how different systems work, and clearly, organizing papers by author is the only way to go, as Science Woman suggests. The sheer size of my collection, however, called for something more efficient than giving each author his or her own folder (with subfolders), as she does. My system is much more efficient in terms of storage … both with respect to space and the work involved in putting articles away … but at a modest cost in efficiency of access.
I have my files divided alphabetically with a series of folders for the first few letters of each last name. So, in theory, “Brooks, Brookings, Broohaha, and Broomhilda” are all in the file labeled “BROO.” If I wanted to also include “Brown, Browning, and Brozeski” I would instead have a folder named “BRO,” but that would have more papers in it. If I adjust the number of letters, I can keep the number of papers filed per folder to a manageable number. I can then find an article by searching through the small stack of papers with those last names. If I have a zillion papers by one author, I can give that author her own file, as in Brooks, Alison (lots of papers).
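For the programmatically inclined, the prefix-folder idea can be sketched in a few lines of Python. This is only an illustration of the filing rule described above, not software I actually use: the function name `assign_folders`, the `max_per_folder` cap, and the starting prefix length of three letters are all assumptions made up for the example. The input is one surname per paper, so an author with many papers simply appears many times.

```python
from collections import defaultdict

def assign_folders(surnames, max_per_folder=4, prefix_len=3):
    """Return {folder label: sorted surnames}, one surname entry per paper.

    Hypothetical sketch: a folder that holds too many papers is split by
    lengthening its prefix by one letter, repeatedly if necessary.
    """
    groups = defaultdict(list)
    for name in surnames:
        groups[name[:prefix_len].upper()].append(name)

    folders = {}
    for label, names in groups.items():
        longest = max(len(n) for n in names)
        if len(names) > max_per_folder and prefix_len < longest:
            # Too crowded: re-file this group under longer prefixes.
            folders.update(assign_folders(names, max_per_folder, prefix_len + 1))
        else:
            # Small enough, or the prefix cannot grow any further
            # (a prolific author ends up with a folder of her own).
            folders[label] = sorted(names)
    return folders

papers = ["Brooks", "Brookings", "Broohaha", "Broomhilda",
          "Brown", "Browning", "Brozeski"]
print(sorted(assign_folders(papers, max_per_folder=4)))
print(sorted(assign_folders(papers, max_per_folder=10)))
```

With a cap of four papers per folder the seven names land in three folders (BROO, BROW, BROZ); with a cap of ten they all fit in a single BRO folder, which is the trade-off between folder count and stack size described above.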
One might view this system as less than ideal, and perhaps it is, but with 10,000 papers it works for me.
Every single paper that is in my cabinets has two attributes: 1) It is in my database; and 2) it has either an “x” or a stamp (as in rubber stamp, of my name) on it. This way I know that if I am using an article and want to return it, if it does not have the stamp or x, I have to enter it into the database before I file it. (More about the database below.)
A few years ago, PDF files became standard, and several thousand papers are now in my collection in only electronic form.
As with Science Woman, I name the PDF files in a systematic way, though I don’t keep up with it as well as I should. My method is to use the last name of each author for up to two or three authors, followed by EtAl if there are more, and then the year. I don’t bother with journal name, etc. This is usually sufficient. So, a PDF file that comes to me as:
Notice that there are no spaces in the filename. Spaces in filenames are evil.
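The naming rule is simple enough to automate. Here is a minimal Python sketch of it, under some assumptions of my own: the function name `pdf_filename` is hypothetical, the cutoff is fixed at three authors (the rule above says "two or three"), the names are run together with no separator, and the authors and year in the example are made up for illustration.

```python
def pdf_filename(surnames, year, max_authors=3):
    """Build a space-free PDF filename from author surnames and a year.

    Hypothetical helper: keeps up to max_authors surnames, appends
    "EtAl" if there are more, then the year.
    """
    # Keep letters only, which drops spaces, hyphens and apostrophes.
    cleaned = ["".join(ch for ch in name if ch.isalpha()) for name in surnames]
    stem = "".join(cleaned[:max_authors])
    if len(cleaned) > max_authors:
        stem += "EtAl"
    return f"{stem}{year}.pdf"

# Made-up authors and year, for illustration only.
print(pdf_filename(["Brooks", "Brown"], 1999))
print(pdf_filename(["Brooks", "Brown", "Browning", "Brozeski"], 1999))
```

The first call gives `BrooksBrown1999.pdf` and the second gives `BrooksBrownBrowningEtAl1999.pdf`. Stripping everything but letters from each surname is one way to guarantee that no spaces sneak in from multi-word last names.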
Some of these PDF files are in the database, some are not. Instead of spending time on that, I’ve opted to have a more general index of the PDF files. I’m still playing around with this and making adjustments, but my current method is to use Beagle.
Beagle uses pipes and translators. A translator converts the contents of a PDF file (or pretty much any kind of file that you might want to index) into a text stream, which is then run through an indexer with the results added to an index. So you can find a PDF by Brooks and McBrearty by entering those two names into Beagle. This does not work for PDF files that are photocopies of papers rather than true PDF text files.
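To make the translator-plus-indexer idea concrete, here is a toy Python sketch. It is emphatically not Beagle's code or API, just an illustration of the general technique: the names `TRANSLATORS`, `build_index`, and `search` are invented for the example, the only translator included handles plain text, and the "files" are in-memory byte strings so the example is self-contained. A real PDF translator would need a third-party text-extraction library, and a scanned photocopy would yield no text for it to index.

```python
import os
import re
from collections import defaultdict

def translate_plain_text(raw):
    """Translator: turn the raw bytes of a text file into a string."""
    return raw.decode("utf-8", errors="ignore")

# Hypothetical registry mapping file extensions to translators.
TRANSLATORS = {".txt": translate_plain_text}

def build_index(files):
    """files maps filename -> raw bytes; returns word -> set of filenames."""
    index = defaultdict(set)
    for name, raw in files.items():
        translator = TRANSLATORS.get(os.path.splitext(name)[1].lower())
        if translator is None:
            continue  # no translator for this file type, so it is not indexed
        for word in re.findall(r"[a-z]+", translator(raw).lower()):
            index[word].add(name)
    return index

def search(index, *terms):
    """Return the files that contain every one of the search terms."""
    hits = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*hits) if hits else set()

files = {
    "a.txt": b"Brooks and McBrearty on the Middle Stone Age",
    "b.txt": b"Brooks alone",
    "scan.pdf": b"image data with no text layer",
}
index = build_index(files)
print(search(index, "Brooks", "McBrearty"))
```

Searching for both names returns only `a.txt`. The file `scan.pdf` never enters the index because nothing in this sketch can turn it into text, which is the same reason photocopy-style PDFs do not turn up in a search.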
On the database: My original database was in dBase (II or III? Maybe IV? Can’t remember). Later, I translated this into Endnote.
Science Woman uses Endnote, as I did until I made the transition to Linux. Actually, I used Endnote on Linux for a while, running Windows Endnote on Crossover, and that worked flawlessly. But I now use Bibus. It was not difficult to import my Endnote file into Bibus. Bibus is an SQL database that imports from Medline, etc., and will put properly formatted references into document files (I use Open Office) and so on. It is just like Endnote, but you can look under the hood, and it is faster and free. Since it is a MySQL database, you can also access the raw data directly and play with it that way if you want.
I plan on changing my database system to use my own SQL system, in the medium future.
The most important message from both Science Woman and me is this: Start now. Get a good system and use it early in your career or you will be swimming in chaos later.