Adventures in Ethics and Science

In the last post, we looked at a piece of research on how easy it is to clean up the scientific literature in the wake of retractions or corrections prompted by researcher misconduct in published articles. Not surprisingly, in the comments on that post there was some speculation about what prompts researchers to commit scientific misconduct in the first place.

As it happens, I’ve been reading a paper by Mark S. Davis, Michelle Riske-Morris, and Sebastian R. Diaz, titled “Causal Factors Implicated in Research Misconduct: Evidence from ORI Case Files”, that tries to get a handle on that very question.

The authors open by making a pitch for serious empirical work on the subject of misconduct:

[P]olicies intended to prevent and control research misconduct would be more effective if informed by a more thorough understanding of the problem’s etiology. (396)

If you know what causes X, you ought to have a better chance of being able to create conditions that block X from being caused. This seems pretty sensible to me.

Yet, the authors note, scientists, policy makers, and others seem perfectly comfortable speculating on the causes of scientific misconduct despite the lack of a well-characterized body of relevant empirical evidence about these causes. We have plenty of anecdata, but that’s not quite what we’d like to have to ground our knowledge claims.


This is especially relevant for those of us who are supposed to teach the next generation of scientists how to conduct responsible research, since arguably a better understanding of what makes scientists’ conduct go wrong could helpfully guide the instruction aimed at keeping future scientists from falling into the same errors. Davis et al. are not, however, arguing that all ethics training be halted until the full causal analysis of research misconduct has been completed:

Legions of new scientists are continually being trained, and it is reasonable to acquaint them with research norms and the consequences of their violation early in their training programs, regardless of whether ignorance of such norms actually underlies instances of research misconduct. (396)

I’m assuming this will come as a relief to my students this semester.

Anyway, Davis et al. are presenting an empirical study of the causes of scientific misconduct. Before describing the research they conducted, they describe the sorts of causes for misconduct that were alleged prior to this empirical research.

One of these is a flaw in the individual researcher committing the misconduct. They write:

Upon a finding of scientific misconduct, the respondent (as the individual accused of research misconduct is referred to by the ORI) is subject to a variety of consequences including debarment. So it is appropriate, although perhaps to some unduly reductionistic, for analyses of etiology to include the individual level of analysis. (396)

Reductionist or not, this is an explanation that the authors note received support even from a scientist found to have committed misconduct, in testimony he gave about his own wrongdoing to a Congressional subcommittee:

I do not believe that the environment in which I work was responsible for what I have done. Competition for limited research funds among research investigators is a necessary part of federally funded scientific work. Neither this, nor competition for major awards in science, can be implicated as an important factor in my particular instance. An honest investigator should be able to deal effectively with the traditional ‘publish or perish’ pressures… The loss of my ability to be an objective scientist…cannot…be linked to defects in the system under which I worked. (397)

In any case, identifying some feature of the bad actor — whether transient emotional or mental state, or personality (maybe having a large ego, extreme narcissism, or an unwavering belief in the truth of his or her hypotheses regardless of what the data might show) — as the cause of the bad act is part of the story that is sometimes told in the aftermath to make sense of acts of scientific misconduct.

Another theory is that bad actions are bad responses to difficult circumstances. In prior work, two of the authors of the current research catalogued situational factors identified by the bad actors themselves:

Mark Davis and Michelle Riske note that some of those who had been found guilty of scientific misconduct expressed that they had been experiencing family and other personal difficulties at the time of their involvement. These difficulties included, but were not limited to:

  • Loss of family members
  • New baby
  • Emotional difficulties due to a relationship breakup
  • Wife’s complicated pregnancy
  • Son diagnosed with Attention Deficit Disorder and Conduct Disorder
  • Parents’ disappointment over respondent not getting into medical school
  • After purchasing a new home, respondent’s salary was cut

There is evidence, then, that situational factors belong on the list of potential etiological factors underlying research misconduct. One has to wonder, though, whether these situational factors, much like mental and emotional problems, might be used by those who are caught as a means of avoiding responsibility for their own actions. (398-399)

Then there’s the possibility that it is the organizational factors and structural factors shaping the environment in which the scientific work takes place that push the bad actors to act badly. Organizational factors include issues like the nature of relationships between supervisors and underlings, while structural factors might include ways that scientific performance is evaluated (e.g., in hiring, promotion, or tenuring decisions, or in competitions for funding).

Before we press on here, I feel like I should put my cards on the table. I myself have a tendency to notice organizational and structural factors, and a history of suggesting we take them more seriously when we talk about responsible conduct of research. This has not been grounded in a large body of empirical research so much as in the fact that the folks near the top of the scientific food chain sometimes seem to me unwilling to examine whether such factors could make a difference, or to acknowledge that organizational and structural factors are not, in fact, immovable objects.

Am I wrong to focus on organizational factors? Am I right? We’ll see what this research has to say about that.

Finally, another hypothesis is that cultural factors may be causally connected to instances of misconduct. Davis et al. note a study of allegations of research misconduct or misbehavior (at a single research institution) that found foreign researchers made up a disproportionate share of those accused. As well, they point to claims that foreign early-career researchers in the U.S. are more likely to feel obligated to include their scientific mentors in their countries of origin as guest authors on their own publications. They don’t note the claim, which I have heard but for which I have not seen much methodical empirical support, that foreign-born scientists are operating with a different understanding of proper acknowledgment of prior work and thus might be more likely to plagiarize. Such an explanation, though, clearly turns on cultural factors.

Given these stories we tell in the aftermath of an instance of scientific misconduct about just what caused an apparently good scientist to act badly, Davis et al. set out to get some empirical data:

Specifically, this study is an attempt to identify the causes of research misconduct as perceived by those against whom a finding of scientific misconduct was made. In particular, this paper presents the results of a study using data extracted from ORI case files to identify the factors implicated in research misconduct. (400)

You’ll note that there may still be a gap between what the bad actor perceives as the causes of her bad act and what the actual causes were — people can deceive themselves, after all. Still, the bad actors probably have some privileged access to what was going on in their heads when they embarked on the path of misconduct.

Let’s talk methodology.

Davis et al. examined the “closed” cases of research misconduct (with a finding of misconduct against the accused) conducted by the Office of Research Integrity (ORI) as of December 2000. (The ORI came into existence in May 1992 as a successor to the Office of Scientific Integrity (OSI), so we’re talking about a period of about 8.5 years here.) The subjects here are not a random sampling of members of the scientific community. They are scientists accused and found guilty of misconduct. However, the researchers here are looking for empirical data about why scientists engage in the behaviors that fall under scientific misconduct, and I’m guessing it would be challenging to identify and study misbehaving scientists who haven’t (yet) been accused or convicted of misconduct “in the wild”, as it were.

The information about these subjects is constrained by the information included (or not included) in the ORI case files. Davis et al. didn’t collect demographic data (such as gender, age, or ethnicity) from the case files. And, they excluded from their analyses case files that “failed to yield information relating to etiology” (401). Out of the 104 case files the researchers reviewed, 12 were excluded for this reason.

How did Davis et al. extract data from these case files — case files that included the reports of university investigations before cases were passed up to ORI, transcripts of hearings, letters and emails that went back and forth between those making the charges, those being charged, and those investigating the charges, and so forth? They developed an “instrument” for data collection for researchers to use in reviewing the case files. The data collection instrument is a way to make sure researchers extract relevant bits of information from each file (like the nature of the misconduct claim, who made the accusation, how the accused responded to the charges, and what findings and administrative actions ORI handed down).

To make sure that the data collection instrument did what it was supposed to before they turned it to the case files under study, they did a “test drive” on 15 closed case files from OSI.

Here’s how Davis et al. describe the crucial bit of the data extraction, aimed at gleaning data about perceived causes of the subjects’ misconduct:

The first step in the data analysis process employed a strategy adopted from phenomenological research wherein the textual material is scanned for statements or phrases which could explain why the misconduct occurred or possible consequences as a result of the misconduct. Rather than searching for evidence of specific theories or propositions, the investigator examines the data more for explication than explanation.

Once the data were collected from the files at the ORI, two different coders extracted phrases that conveyed causal factors implicated in research misconduct. As a check against possible bias created by prior knowledge or other factors, the analyst extracted verbatim phrases rather than interpreted or paraphrased concepts. The second analyst approached the data in the same manner, identifying exact wording thought to convey possible causes of research misconduct. The statements or phrases pulled from the instrument were recorded on index cards. The two analysts then compared and reconciled their lists. Any discrepancies were resolved by the research team so that items were coded in a consistent fashion. (402)
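To make the reconciliation step concrete, here is a minimal sketch of how two coders' verbatim extractions from a single case file might be compared. The phrases and the `reconcile` helper are invented for illustration; the paper describes index cards and discussion by the research team, not any particular software.

```python
# Hypothetical verbatim phrases (invented for illustration) that two coders
# pulled from the same case file.
coder_1 = {"under pressure to publish", "no one checked my notebooks", "I was exhausted"}
coder_2 = {"under pressure to publish", "I was exhausted", "afraid of losing my position"}


def reconcile(a, b):
    """Split two coders' extractions into agreed phrases and discrepancies."""
    agreed = a & b   # phrases both coders extracted
    only_a = a - b   # phrases only the first coder extracted
    only_b = b - a   # phrases only the second coder extracted
    return agreed, only_a, only_b


agreed, only_1, only_2 = reconcile(coder_1, coder_2)
print(sorted(agreed))
print(sorted(only_1 | only_2))  # items the research team would still need to resolve
```

Because both coders record exact wording rather than paraphrases, agreement reduces to set membership, which is presumably part of the appeal of extracting verbatim phrases.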

Of course, the case files contained claims not just from the scientists found guilty of misconduct but also from the folks making the allegations against them, others providing testimony of various kinds, and the folks adjudicating the cases. Davis et al. note that at least some of these claims ought to be recognized as “hearsay”, and thus they decided to err on the side of caution rather than inferring any official judgment on the cause of misconduct in a particular case.

Once they had the stack of index cards with verbatim causal claims pertaining to the misconduct in each case file, they grouped those claims by concepts. Here are the 44 concepts they used:

  1. Pressure to Produce
  2. Cognitive Deficiency
  3. Inappropriate Responsibility
  4. Difficult Job/Tasks
  5. Poor Supervisor (Respondent)
  6. Professional Conflicts
  7. Stress/Pressure in General
  8. Stressful Job
  9. Supervisor Expectations
  10. Insufficient Supervision/Mentoring
  11. Non-collegial Work Environment
  12. Lack of Support System
  13. Substandard Lab Procedures
  14. Overworked/Insufficient Time
  15. Poor Communication/Coordination
  16. Competition for Position
  17. Insecure Position
  18. Pressure on Self/Over-Committed
  19. Desire to Succeed/Please
  20. Personal Insecurities
  21. Fear
  22. Poor Judgment/Carelessness
  23. Lack of Control
  24. Impatient
  25. Jumping the Gun
  26. Frustrated
  27. Laziness
  28. Apathy/Dislike/Desire to Leave
  29. Personal Problems
  30. Psychological Problems
  31. Character Flaw
  32. Language Barrier
  33. Restoring Equity
  34. Recognition
  35. Avoid Degradation
  36. Denial of an Injury
  37. Lie to Preserve the Truth
  38. Public Good Over Science
  39. Lost/Stolen/Discarded Data
  40. Denial of Negative Intent
  41. Reliance on Others/Permission
  42. Slippery Slope
  43. Amnesia
  44. Condemnation of the Condemner

(Davis et al. call these concepts covering attributions of causation “factors implicated in research misconduct.”) They also classified whether the causal claims about the misconduct were being made by the respondent to the misconduct charges (“This is what made me do it”) or by someone other than the respondent explaining the respondent’s behavior.

Davis et al. then analyzed this data:

To explain patterns in the data, multidimensional scaling and cluster analysis was employed. The combined use of these techniques is borrowed from the Concept Mapping/Pattern Matching (CMPM) methodology. Concept mapping is a type of structured conceptualization which can be used by groups to develop a conceptual framework which can guide evaluation or planning.

Although reliability for CMPM has been well-established, its calculation departs from conventional test theory in which there are either correct or incorrect answers. Because these do not exist for CMPM, reliability focuses on the consistency of the maps produced as opposed to the individual items. (402)

My familiarity with CMPM is only slight, and instances where I have seen it used have tended to be higher education leadership workshops and things of that ilk. Davis et al. explain some of the ways they adapted this methodology for use in their research:

A more conventional use of the CMPM methodology would involve preparing a research or evaluation question, and then gathering a group of stakeholders to identify individual items that address that question. For example, if this study were conducted in a fashion consistent with most CMPM studies, the investigators would have convened a group of stakeholders who are experts on research misconduct, and then asked these individuals, ‘What are the factors or causes that lead to research misconduct?’ This study deviates from that conventional approach, a deviation we believe enhances the objectivity of the CMPM process. Rather than asking experts to identify via a focus group those factors associated with research misconduct, evidence from the ORI case files was used to identify codes that help explain research misconduct. (403)

Similarly, Davis et al. didn’t ask experts (or bad actors) to sort into meaningful stacks the 44 concepts with which they coded the claims from the case files, then take this individual sorting to extract an aggregate sorting. Rather, they let the case files generate the meaningful stacks — the subset of 44 concepts that covered claims made in a particular case file were counted as being in a stack together. Then, the researchers used those case file-generated stacks (along with multidimensional scaling and cluster analysis) to work out the aggregate picture of how 44 concepts are associated.
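As a rough illustration of what "letting the case files generate the stacks" could look like, here is a minimal sketch that counts how often each pair of concepts is cited in the same case file. The case names and the concept sets are made up for the example, and the function name is my own; the paper does not publish its data in this form.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (invented): each case file maps to the set of concept
# numbers, from the list of 44, that were coded in that file.
case_files = {
    "case_A": {8, 9, 14},
    "case_B": {8, 14, 29},
    "case_C": {10, 13},
    "case_D": {10, 13, 15},
}


def cooccurrence_counts(stacks):
    """Count, for each unordered pair of concepts, how many case files cite both."""
    counts = Counter()
    for concepts in stacks.values():
        # Sorting gives each pair a canonical (smaller, larger) key.
        for a, b in combinations(sorted(concepts), 2):
            counts[(a, b)] += 1
    return counts


counts = cooccurrence_counts(case_files)
print(counts[(8, 14)])   # 2: cited together in case_A and case_B
print(counts[(10, 13)])  # 2: cited together in case_C and case_D
print(counts[(8, 10)])   # 0: never cited in the same file
```

A table of pair counts like this is the kind of input that scaling and clustering methods can then work from.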

Now, onto their findings.

First, you’re probably interested in the broad details of the 92 closed cases they examined. The most common cases in this group involved findings of falsification (39%) or fabrication and falsification (37%), with plagiarism making a healthy showing as well. The respondents to the charges included assistant professors (12%), associate professors (13%), full professors/department heads (9%), graduate students (12%), postdocs (13%), and technicians or research assistants/associates (24%). (17% of the sample respondents didn’t fit any of those classifications.) As far as the degrees held, the respondents included M.D.s (16%), Ph.D.s (38%), and M.D./Ph.D.s (7%), as well as respondents without either of these degrees (22%). For 17% of the respondents, the case files did not provide information on respondents’ level of education.

What did the case files offer as far as what could have caused the misconduct in the particular cases? Davis et al. write:

The average number of explanations for research misconduct identified in a particular case file was approximately 4 (mean = 3.8, s.d. = 3.0, range 1-15). The frequency with which individual explanations for research misconduct were identified among all case files ranged from 1 to 47 times (mean = 11.8, s.d. = 10.8). (405)

In other words, there was no single case file in which all 44 of the factors implicated in research misconduct were implicated — at most, a single case file pointed to 15 of these factors (about a third of the entire set). Some of the factors in the list of 44 were only cited in a single case, while others were cited in multiple cases (including one cited in 47 cases, more than half of the 92 cases analyzed).

The researchers generated plots and matrices to identify how the various factors implicated in research misconduct coincided in these 92 case files — which ones seemed frequently to travel together, and which ones were hardly ever cited in the same case. Potentially, the factors that repeatedly coincide, seen as “clusters”, could be understood in terms of a new category that covers them (thus reducing the list of factors implicated in research misconduct to a number less than 44).
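For a feel of how factors that "travel together" can be grouped, here is a small sketch under loud assumptions: the stacks are invented, the similarity measure (Jaccard) is my choice, and the sketch skips multidimensional scaling entirely in favor of a simple average-linkage merge. It is a stand-in for the general idea, not the CMPM procedure the authors used.

```python
from itertools import combinations

# Hypothetical toy stacks (invented): concept numbers cited together in each case file.
stacks = [{8, 9, 14}, {8, 14, 29}, {9, 14}, {10, 13}, {10, 13, 15}, {13, 15}]


def similarity(a, b, stacks):
    """Jaccard similarity: files citing both concepts / files citing either."""
    both = sum(1 for s in stacks if a in s and b in s)
    either = sum(1 for s in stacks if a in s or b in s)
    return both / either if either else 0.0


def average_link(c1, c2, stacks):
    """Mean pairwise similarity between the members of two clusters."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(similarity(a, b, stacks) for a, b in pairs) / len(pairs)


def cluster(stacks, n_clusters):
    """Greedy agglomerative clustering: repeatedly merge the most similar pair."""
    concepts = sorted(set().union(*stacks))
    clusters = [frozenset([c]) for c in concepts]
    while len(clusters) > n_clusters:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_link(pair[0], pair[1], stacks))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)


result = cluster(stacks, 2)
print(result)  # → [[8, 9, 14, 29], [10, 13, 15]]
```

In this toy data no file cites concepts from both groups, so the two clusters fall out cleanly; real case files would be messier, and the number of clusters to stop at is itself a judgment call.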

Davis et al. identified seven such clusters in their analysis of the data. Let’s look at how the factors ended up clustering (and the labels the researchers used to describe each cluster) and then discuss the groupings:

Cluster 1 — Personal and Professional Stressors:

8. Stressful Job
9. Supervisor Expectations
12. Lack of Support System
14. Overworked/Insufficient Time
17. Insecure Position
18. Pressure on Self/Over-Committed
19. Desire to Succeed/Please
20. Personal Insecurities
22. Poor Judgment/Carelessness
29. Personal Problems
30. Psychological Problems
36. Denial of an Injury
40. Denial of Negative Intent

Cluster 2 — Organizational Climate Factors:

6. Professional Conflicts
10. Insufficient Supervision/Mentoring
11. Non-collegial Work Environment
13. Substandard Lab Procedures
15. Poor Communication/Coordination
39. Lost/Stolen/Discarded Data
41. Reliance on Others/Permission
44. Condemnation of the Condemner

Cluster 3 — Job Insecurities:

3. Inappropriate Responsibility
5. Poor Supervisor (Respondent)
16. Competition for Position
32. Language Barrier

Cluster 4 — Rationalizations A:

23. Lack of Control
25. Jumping the Gun
37. Lie to Preserve the Truth

Cluster 5 — Personal Inhibitions:

4. Difficult Job/Tasks
26. Frustrated

Cluster 6 — Rationalizations B:

21. Fear
28. Apathy/Dislike/Desire to Leave
33. Restoring Equity
35. Avoid Degradation
42. Slippery Slope

Cluster 7 — Personality Factors:

24. Impatient
27. Laziness
31. Character Flaw
34. Recognition
38. Public Good Over Science
43. Amnesia

Note that the analysis yielded two distinct clusters of rationalizations the accused might offer for misconduct.

Cluster 1 seems to cover the publish-or-perish stressors (and everyday situational challenges) through which scientists frequently have to work. Cluster 2 encompasses factors related to the structure of larger organizations and the group-level interactions within them. Davis et al. describe Cluster 3 as relating more to the scientist’s perception of his or her job security or individual response to normal work pressures. (It may well be, though, that the normal work pressures of the research scientist are somewhat different from normal work pressures in other fields.) Clusters 4 and 6 both capture rationalizations offered for misconduct. Cluster 5 identifies two factors connected to the individual’s response to workplace stressors, while Cluster 7 seems to cover personality flaws that might undermine responsible conduct of research.

What can we conclude from these results? Does scientific misconduct happen because of bad people, or because of situations that seem to leave researchers with a bunch of bad choices? Again, given that the researchers are analyzing perceptions of what caused the cases of misconduct they examined, it’s hard to give a clean answer to this question. Still, Davis et al. argue that the case files that provide their data were worth examining:

One unique contribution of this study is that it made use of attributions found in actual case files of research misconduct. Data from cases in which individuals were found to have committed scientific misconduct offer insights different from other methodologies such as surveys that call for subjects’ opinions on why research misconduct occurs. This research was limited in that it only examined information contained within the case files for individuals who have had a finding of research misconduct by ORI. Nevertheless, these data help to further understanding of research misconduct, especially why those involved in it believe it occurs. Future research might explore causal factors implicated in cases in which research misconduct was alleged but not found by ORI. Also of interest would be instances of research misconduct investigated by administrative bodies other than the ORI. (411)

(Bold emphasis added.)

Knowing why people acted the way they did (or at least, why they think they acted the way they did) might be useful in working out ways to keep people from behaving like that in the future. Some of this may turn on helping individuals make better choices (or doing a better job of screening out people with personality factors that make bad choices far too likely). Some of it may involve changing organizational and structural factors that make the better choices too difficult to put into action and the worse choices too tempting.

The authors here note that there are clear implications for effective strategies as far as responsible conduct of research (RCR) instruction — namely, that talking about the causal factors that have been implicated in actual cases of misconduct may focus needed attention on strategies for dealing with work stressors, weakness of will, or whatever factor threatens to turn a good scientist into a cheater. They also note that this could be useful information as far as developing better employee assistance programs for research staff, helping researchers to manage scientific workplace stressors rather than crumbling before them.

So, at the end of this research, there is no smoking gun, no single identifiable cause responsible for these cases of scientific misconduct. Possibly what this means is that there are multiple factors that can (and do) play a role.

As such, the prospects for a silver bullet that might eliminate all scientific misconduct don’t look good. However, to the extent that data from real (rather than merely hypothetical) cases might give a better picture of where acts of misconduct come from, more of this kind of research could be helpful.

_______

Davis, M., Riske-Morris, M., & Diaz, S. (2007). Causal Factors Implicated in Research Misconduct: Evidence from ORI Case Files. Science and Engineering Ethics, 13(4), 395-414. DOI: 10.1007/s11948-007-9045-2

Comments

  1. #1 Don Monroe
    March 29, 2010

    Thanks for the very interesting summary. I’ve always found the glib, confident attributions of motives for misconduct to ring hollow. The one that seems to be cited most often in the general news is the dollar value of the grants, which I think misses most scientists’ motivations by a mile.

    Still, although this is a good thing to look into, I think it’s more important to limit the consequences of misconduct. To continue the medical metaphor, it may not help that much to know the etiology of the disease, if we can’t prevent it. But we still want to know how to treat it, to minimize the damage it causes, even if we can’t prevent it.

    For scientific misconduct, the worst damage arises from pollution of the literature by erroneous results (although some of these will always arise through honest error). The most important thing that can help reduce these effects is the healthy and skeptical engagement of collaborators, who are the only ones who can really know what’s going on in the lab. I suspect the primary barrier to such skepticism is the feeling that it is a violation of the trusting relationship to even consider the possibility that one’s collaborator is misbehaving.

  2. #2 Anonymous
    March 30, 2010

    To me, most of the “concepts” compiled by the authors from the ORI misconduct cases read as a list of excuses that kids produce when caught with their hand in the cookie jar. Once caught, the main effort by the “criminal” is to rehabilitate his/her name through minimizing their own personal responsibility. This list of “concepts” and their clusters is exactly that, a list of excuses that minimize personal responsibility. That’s why we cannot find among these “concepts” even one that reads: “I started cheating in grade school by plagiarizing on take-home exams. I was good at it then and I have perfected my methods of falsifying and fabricating data over the years, which prevented me from ever being caught. I cannot believe I was caught this time.”

    My point is, most fraudsters in science have done it before and simply got away with it. Whether or not the tendency to cheat is a character flaw or a learned behavior, psychologists could probably come up with a relatively simple test that would flag potential cheaters.

  3. #3 Comrade PhysioProf
    April 3, 2010

    My direct knowledge of a decent number of misconduct cases leads me to the following theory that covers the majority of these cases (but not, of course, all).

    (1) Those who commit misconduct do not start out as nefarious schemers intentionally seeking to subvert the system.

    (2) Trainees who commit misconduct work under the mentorship of desk-bound PIs.

    (3) The seeds of misconduct are planted when a trainee brings fresh new honestly obtained preliminary data to the PI, and the PI gets really excited, effusively praises the trainee, poses a provocative hypothesis based on the data, and then sends the trainee back out to confirm/follow-up/build-upon the preliminary data and verify the hypothesis.

    (4) Those seeds are watered when the trainee fails to confirm the preliminary data, explains that to the PI, and the PI expresses disappointment, asserts that something must have been wrong with the second set of experiments (and not the first), and sends the trainee back out into the lab to try again.

    (5) The tree of misconduct germinates when the trainee at this point starts to cherry pick data that supports the hypothesis and garners praise from the PI. At first, this cherry picking may even be arguably legitimately justifiable on grounds ostensibly independent of whether those data support the hypothesis or not.

    (6) The PI sees this set of data that supports the hypothesis (but not the data that excludes it) and begins to feel more and more strongly that the hypothesis is correct, and no longer even gives lip service to the possibility that the initial findings were a fluke or mistake and the hypothesis bogus. The roots are beginning to take hold.

    (7) The PI and the trainee are now mutually vested in the truth of the hypothesis, and the trainee–perhaps due to some level of weakness of character or will–feels locked in, and physically unable to present the PI with unbiased data that would exclude the hypothesis. Buds are forming.

    (8) The PI gets more insistent with the trainee that it should be possible to obtain clear, convincing, unambiguous data proving the hypothesis to be correct. The trainee finally succumbs to the pressure that has built up very gradually over time, and frankly fakes some data. The tree has flowered.

    (9) Once that line has been crossed by the trainee, there is no turning back, and all of the incentives from that point forward make it far preferable to fake more data than to tell the truth. Full-blown large-scale data fakery ensues.

  4. #4 Dan Hicks
    April 4, 2010

    Perhaps I missed something or know much less about epidemiology/etiology than I think I do, but I don’t understand the methodology here.

    First, there’s no control group here. One oversimplified but straightforward and common way of trying to detect causation is by looking for factors that satisfy a conditional probability inequality:

    P( misconduct | controlled-variables & factor ) > P( misconduct | controlled-variables & not-factor )

    But if P( misconduct ) = 1 (because every individual in your sample committed misconduct) then this inequality is trivially false.

    Then, second, looking at correlations between the purported factors doesn’t tell you anything more than, e.g., if someone’s given #8 in their deposition or whatever then they’re likely to also give #9. It doesn’t tell you, for example, how prevalent any of these factors or clusters are among individuals convicted of misconduct. If everyone cites an item from cluster 3 and only a few people cite an item from cluster 1, say, there’s some reason to look more closely at job insecurity than personal and professional stressors in future studies.

    I do think they’ve done a fine job of developing a preliminary taxonomy of possibly relevant factors. But it isn’t anything more than that. It’s not even a preliminary taxonomy of *actually* relevant factors.

  5. #5 Anonymous
    April 4, 2010

    Wow, for comment #3. Decent number (n=1 or 2)? Although it is refreshing to read a long and detailed comment by CPP without even a hint of profanity, I wonder how the real CPP would respond to a comment like that (#3) if written by someone else. I also find it interesting that the imaginary PI seems to be the real culprit in CPP’s scenario of a developing case of scientific misconduct. As if the poor trainee is just an immature child who succumbs to unbearable pressure by a PI who’s desk bound and doesn’t know or care what’s happening in his/her own lab.

  6. #6 Comrade PhysioProf
    April 4, 2010

    I also find it interesting that the imaginary PI seems to be the real culprit in CPP’s scenario of a developing case of scientific misconduct.

    Congratulations! You can fucking read!

    Clarification: The theory isn’t about “culprits”; the theory is one of causality.

  7. #7 Anonymous
    April 4, 2010

    “Congratulations! You can fucking read!”

    Now, this is vintage CPP.

    “Clarification: The theory isn’t about “culprits”; the theory is one of causality.”

    Nevertheless, you still claim that the PI is the cause of the trainee’s misconduct and you know that this is BS. Are all your trainees first-graders?

  8. #8 JF
    April 8, 2010

    I think there are really only three causes:
    1) A lack of integrity,
    2) A lack of responsibility, and/or
    3) A lack of communication.

    The rest are just excuses.