There have been a number of piercing calls for training of data professionals (of various stripes) in the last year or so. Schools of information have been answering: Illinois, North Carolina, others.
Honestly, I’m getting a sinking feeling in my stomach. If I were to label it, the label would go something like “where are all these newly-minted data professionals going to work?”
My stomach sinks worse when I realize that quite a few of the calls are coming from the same people and organizations who uttered piercing calls for the establishment of institutional repositories in the early 2000s. Libraries did as they were bid; the results were at best mediocre (and that’s a generous assessment). The callers have not, to the best of my knowledge and belief, acknowledged any error in the call they made, much less any of the waste and damage caused. So we’re going to trust these same people on a similar leap into the half-known?
The larger question is how we move data professionals into the research enterprise. It’s an analogous question to others that have surfaced in libraries: moving librarians into the classroom to teach more than Booleans, for example. We’ll hear some of the same things from the people we want to help: “a solution in search of a problem,” notably, as well as “how can you possibly understand my research if you’re not just like me?”
(My answer is what it’s always been: “I don’t have to understand your specific data to tell you that keeping data on CD-ROMs in a shoebox under your desk is a bad idea.”)
I’ve seen one answer I like: internships. GSLIS at Illinois moves its data-curation students into data-related internships once they graduate. They beat the bushes for research organizations looking for the kind of help their graduates provide. In so doing, they ease their people into jobs, raise the profile of their program, and raise the profile of information professionals as research partners generally. This is smart business. I go further: I believe it wholly irresponsible to have a data-curation instruction program targeted at librarians and information professionals without such an internship program.
Training scientists is another question, of course; I don’t think it’s quite as necessary to do internships in the well-accepted informatics fields. It probably can’t hurt, though.
Grant funders: I’d like to see some bribes happening. Make money available to grantees to hire on data professionals. The wording of such grants will be tricky (you don’t want them hiring just another developer), but I’m sure you can do it. Likewise, fund the internships I just described! Finally, any research you can fund that demonstrates good outcomes from the presence of data professionals can only help.
Institutions: I don’t know; I truly don’t. Some days I believe that data management can only happen on the level of the individual research lab. Some days I believe that data can only survive if institutions tackle the problem. Some days I believe both, and my head hurts.
We all of us need to avoid some obvious pitfalls, however. The maverick-manager pitfall familiar to libraries from the IR disappointment is one: data curation for an entire research institution cannot become the exclusive purview of one or a handful of supposed data professionals, especially when they have no budget, no developers, one server at most, and no institutional network.
Flooding the job market is another. Data professionals will lose what little credibility among researchers we have if dozens of us wind up applying to every open job. That leads to perhaps the shortest road to deprofessionalization in history! Let’s not do it. One way to avoid it may be to bite the bullet about incoming qualifications: perhaps we need to sigh and say “no science BA, no enrollment in this program; MAs and Ph.D.s in science preferred.”
That slams the door on me, incidentally, and I wouldn’t be happy about that. But if it means that newly-minted professionals have obvious job-market value, then that’s what we have to do.
Finally, let’s not get quite so exercised yet about who does what work; we risk “I stubbed my toe! Call in a specialist!” syndrome. Let’s focus on the work to be done. Work has a marvelous way of getting done, when it has to be, even by people who aren’t “professionals” and don’t have “professional” training. I am not a professional programmer. I don’t have the least hint of a degree in computer science, software engineering, or anything else. I still write code, because the code won’t write itself. If similar processes are how data curation turns out to happen, that’s fine with me.
Not least because then I won’t have “professional” doors slammed in my face.