Preventing Plagiarism.

By jstemwedel on March 31, 2010.

Especially in student papers, plagiarism is an issue that, it seems, just won't go away. However, instructors cannot just give up and permit plagiarism without giving up most of their pedagogical goals and ideals. As tempting a behavior as this may be (at least to some students, if not to all), it is our duty to smack it down.

Is there any effective way to deliver a preemptive smackdown to student plagiarists? That's the question posed by a piece of research, "Is There an Effective Approach to Deterring Students from Plagiarizing?" by Lidija Bilic-Zulle, Josip Azman, Vedran Frkovic, and Mladen Petrovecki, published in 2008 in Science and Engineering Ethics.

To introduce their research, the authors write:

Academic plagiarism is a complex issue, which arises from ignorance, opportunity, technology, ethical values, competition, and lack of clear rules and consequences. ... The cultural characteristics of academic setting strongly influence students' behavior. In societies where plagiarism is implicitly or even explicitly tolerated (e.g. authoritarian regimes and post-communist countries), a high rate of plagiarism and other forms of academic dishonesty and scientific misconduct may be expected. However, even in societies that officially disapprove of such behavior (e.g. western democracies), its prevalence is disturbing. (140)

Here, there is some suggestion of potentially relevant cultural factors that may make plagiarism attractive -- and not the cultural factors I tend to hear about here in California, on the Pacific Rim. But maybe we can extend Tolstoy's observation about how every unhappy family is unhappy in its own way to recognize the variety of cultural contexts that spawn dishonest students.

And this is not just a matter of the interactions between students and teachers. Bilic-Zulle et al. point to plagiarism in school as something like a gateway drug for unethical behavior in one's professional life -- so potentially, reducing academic dishonesty could have important consequences beyond saving professors headaches.

In any case, the big question the researchers take on is how to reduce the prevalence. Is it effective to emphasize the importance of academic integrity, or to threaten harsh penalties if plagiarism is detected?

This study focused on medical students at the Rijeka University School of Medicine in Croatia. The medical program there included an assignment required of all second-year medical students in which they were asked to write an original essay based on one of four articles selected by their professor. The researchers examined how the instructions given to the students affected the rate of plagiarism in the essays the students submitted. As well, the researchers focused on word-for-word plagiarism, since that's the least ambiguous variety of plagiarism.

The researchers had done previous studies drawing on the second-year cohort in the 2001/2002 and 2002/2003 academic years. For the essay assignment, the second-year medical students in 2001/2002 were asked to write an original essay about one of the four designated articles. In contrast, the second-year medical students in 2002/2003 were asked to write an original essay about one of the four designated articles and were also given an explanation of what plagiarism is and warned not to commit it in their essays.

The essays each group submitted were examined with the plagiarism detection software WCopyfind. The average plagiarism rate of the two second-year cohorts was 19% -- with no significant difference between the group educated about and warned against plagiarism and the group who had received no explicit instructions to avoid plagiarism.

Explaining what plagiarism is and telling students not to do it, in other words, seemed not to make a difference in what students actually did.

Bilic-Zulle et al. then extended that research to look at another cohort (the second-year medical students in the 2004/2005 academic year) to see whether a different warning to students might make a difference.

Here are some details:

During the mandatory course in Medical Informatics, the students were required to write an essay containing 250-1500 words based on a published scientific article. They could choose among the four articles written in Croatian, of which two were available only in hardcopy format, and two were available in electronic format and posted at the Medical School's website. The topics of one article in hardcopy format and one article in electronic format were less complex, whereas the topics of the other two articles were considered more complex. Complexity of the topics was estimated by the instructor, who also explained the content of each article during the Medical Informatics course. Students were allowed to use of additional literature sources for their essay. (141)

I'm going to pause here and note that, with the type size I normally use, 250-1500 words comes to one to five double-spaced pages. These are not tremendously long essays -- not 50 page tomes where students might be expected to be desperate to find extra words.

The number of students in each cohort in the study ranged from 87 to 111 -- so these are not huge samples, but they're not tiny either. All three cohorts of students were offered the same selection of four articles on which to write their essays. Like the 2002/2003 cohort, the 2004/2005 cohort was explicitly warned not to commit plagiarism (and given the explanation of what counts as plagiarism). Unlike the 2002/2003 cohort, the 2004/2005 cohort was told that their essays would be checked for plagiarism by the plagiarism detection software. The promised penalty for being caught plagiarizing (which I assume was promised to all three cohorts) was that the instructor would not verify their regular attendance of the Medical Informatics course until they submitted a properly written essay that was not plagiarized. (Such instructor verification was necessary for the students to take the final exam, which was itself necessary for the students to pass this required course.)

Worth noting here is that the plagiarism detection software was being used to determine whether the students were committing word-for-word plagiarism from the source article on which they were writing their essay. In other words, they were not checking for instances of plagiarism from other sources (e.g., published papers in the literature that might have responded to a source article, or contributed something in the same general area). Nor were they checking for students plagiarizing the work of other students, something that might plausibly happen in an assignment in a required course that used the same set of four source articles year to year.

I am not sure that the admonition to the students that software would be used to check for plagiarism was clear in conveying the limited scope of this check. (We'll get back to this issue in a moment.) But I have to say, if your instructor provides source articles, explains plagiarism, and then tells you not to plagiarize, plagiarizing from the provided source articles strikes me as pretty dumb. Isn't there a good chance that the instructor who provided those source articles is pretty familiar with what's in them?

Anyway, the students submitted electronic versions of their essays, and the researchers did the analysis for plagiarism:

[T]he body text of each essay was compared to the appropriate source article by using WCopyfind plagiarism detection software. The comparison rules were set in accordance with both the software's author recommendations and available published data respecting "the six words rule". Portions of the text consisting of six consecutive words that matched exactly six consecutive words in the source article were considered to be plagiarized. Proportion of the text copied from the source article was calculated from the number of copied words (in strings of six or more consecutive words) and total number of words. Figures and Tables were excluded from analysis due to software's inability to compare content other than text. All essays were manually (visually) controlled by the investigators to ensure that properly quoted text was not counted as plagiarized text. However, no properly referenced direct quote was found in any of student essays. (142)

If the students in the sample were including direct quotes in their essays, that last observation makes me sad.

For the purposes of their analysis, Bilic-Zulle et al. counted the papers with 10% or less of their word count found by the software to be plagiarized from the source article as "not plagiarized", and those with more than 10% of their word count flagged by the software as "plagiarized".
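To make the measurement concrete, here is a minimal Python sketch of the six-word rule and the 10% cutoff as described above. It is not WCopyfind's actual algorithm, and the paper does not give one; the function names, the lowercasing, and the decision to ignore punctuation are all my own assumptions for illustration.

```python
import re

def words(text):
    """Lowercase the text and split it into words (assumption: punctuation is ignored)."""
    return re.findall(r"\w+", text.lower())

def copied_fraction(essay, source, n=6):
    """Fraction of essay words lying inside a run of n or more consecutive
    words that also appears verbatim in the source article."""
    essay_words = words(essay)
    source_words = words(source)
    if not essay_words:
        return 0.0
    # Every n-word sequence that occurs in the source.
    source_ngrams = {
        tuple(source_words[i:i + n])
        for i in range(len(source_words) - n + 1)
    }
    copied = [False] * len(essay_words)
    for i in range(len(essay_words) - n + 1):
        if tuple(essay_words[i:i + n]) in source_ngrams:
            # Mark all n words of the matching window as copied;
            # overlapping windows merge into longer copied strings.
            for j in range(i, i + n):
                copied[j] = True
    return sum(copied) / len(essay_words)

def is_plagiarized(essay, source, threshold=0.10):
    """Apply the study's cutoff: more than 10% of the words copied."""
    return copied_fraction(essay, source) > threshold
```

Note what a check like this cannot see: a paraphrase, or a copied passage with a word swapped every five words, scores zero. It only measures verbatim copying from the one source it is given.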

Here's the table from the paper with the results:

[Image: Table 1 from Bilic-Zulle et al.]

The authors note that while there wasn't a significant difference in the median proportion of plagiarized text between the 2001/2002 cohort (given no special warning about plagiarism) and the 2002/2003 cohort (given an explanation of plagiarism and a warning not to do it), there was a significant drop in the median proportion of plagiarized text in the essays of the 2004/2005 cohort that got the warnings that their essays would be run through the plagiarism detection software (down to 2% compared to 17% or 21%).

They also note that the essays of the 2004/2005 cohort were significantly shorter. One wonders whether giving up on expressing complex ideas in one's own words (and instead copying those words from someplace else) leads to verbosity.

Another trend over the three cohorts studied was that each successive cohort chose a higher proportion of source articles that were available in electronic form rather than just hardcopy, and that each successive cohort had a higher number of students electing to write their essays on more complex topics rather than simpler ones. It's not clear what, if anything, these trends have to do with the different warnings the three cohorts got about the originality of their essays. However, as the authors note, these trends tend to weigh against claims that the availability of electronic documents (and of keyboard shortcuts to cut and paste text) is to blame for plagiarism.

In the 2001/2002 cohort, a full 66% (73 students) turned in essays that crossed the researchers' "plagiarized" threshold (more than 10% of the words in the essay copied verbatim, in strings of 6 or more words). The 2002/2003 cohort that received the stern warning not to plagiarize still had 66% (57 students) turning in essays that crossed this threshold. However, in the 2004/2005 cohort that received a warning not to plagiarize and the information that software would be used to scan their papers for plagiarism, only 11% (10 students) turned in papers that met the researchers' definition of plagiarized.

The students, in other words, seemed largely to take seriously the power of the plagiarism detection software.

In case you're curious about the proportion of papers in each cohort with only "a little plagiarism" (i.e., 1-10% of the words in the essay copied verbatim, in strings of 6 or more words), those were 25% for 2001/2002, 26% for 2002/2003, and 51% for 2004/2005. This means that the totally clean essays accounted for only 9% of the 2001/2002 total, 8% of the 2002/2003 total, and 35% of the "scared straight" 2004/2005 total.

This is, as the authors note, a high prevalence of plagiarism.

As mentioned above, the software the researchers used was only checking the student essays for instances of plagiarism that involved verbatim copying from the source article. Other software tools like Turnitin compare the essays they test against a much larger pool of sources, including other articles in the literature and other student papers that have been submitted to Turnitin. However, the researchers noted that tools like Turnitin tend to be set up to check against sources in English. For detecting Croatian-language plagiarism, that broader scope would not actually be much of an advantage.

I'm inclined to think, though, that with a sufficiently vague warning ("We will be using software tools to check your essays for plagiarism") and without firsthand knowledge of the limitations of such software, the students might have assumed more thorough plagiarism detection than the software could actually deliver. Think of it as the same principle upon which polygraph operators depend: a polygraph can detect lies because the subjects of the polygraph test believe that it can detect lies. Maybe part of how the announcement that plagiarism-detecting software will be used actually works to discourage plagiarism is that the students imagine that the software will detect more plagiarism than it actually can.

The authors of this study do not argue that automated plagiarism-detection is the answer to the underlying problem of academic dishonesty. They write:

Although suffering consequences for plagiarism may deter students from doing so, we favor promoting academic integrity and honesty and teaching students how to avoid plagiarism over sole enforcement of strict rules and penalties. Clear rules, code of ethics at universities, and awareness of responsibility among students may significantly contribute to the reduction of academic misconduct. As the values adopted at university will likely be carried into future professional life, it is very important that faculty continue educating students on inappropriateness of plagiarism and create an environment where academic dishonesty will not be tolerated. (146)

I'm definitely on board with the spirit of this -- understanding how academic dishonesty harms the features of the educational experience and the learning community on which you depend is probably a more compelling reason not to cheat than fear of detection and punishment (especially given that some people seem not to think they'll get caught, or they enjoy the risk-taking that goes with the cheating). And practically, it strikes me that this approach is necessary, unless we are willing to impose automated screenings for cheating in the professional sphere as well.

Indeed, we probably need to recognize that as powerful as plagiarism-detection software becomes, there are competing products designed to help students outwit the software that is scanning their papers. (Sadly, it doesn't seem like they do much in the way of training students to give proper citations to the sources of their words and ideas.) It's a technological arms race, which means that no computer program can fully replace the careful eye of a human being reading student papers.

But maybe if students believe the humans reading their essays have sufficiently powerful technology at their disposal, they'll decide bending the rules is too big a risk to take.

Postscript: I've just noticed an article in the Chronicle of Higher Education that raises related issues. I don't know if I'll discuss it in any detail here, but while I'm making up my mind you may be interested in reading it.

_________

Bilic-Zulle, L., Azman, J., Frkovic, V., & Petrovecki, M. (2008). Is There an Effective Approach to Deterring Students from Plagiarizing? Science and Engineering Ethics, 14 (1), 139-147 DOI: 10.1007/s11948-007-9037-2

On the cultural note:
"Plagiarize
Let no one else's work evade your eyes
Remember why the good Lord made your eyes
So don't shade your eyes
But plagiarize, plagiarize, plagiarize
Only be sure always to call it please "research"
And ever since I meet this man
My life is not the same
And Nicolai Ivanovich Lobachevsky is his name, hi!
Nicolai Ivanovich Lobache-
"
Those darn dirty commies, always sharing things. Like ideas.

Plagiarizing from a set text is nothing: I've come across students plagiarizing books written by the lecturer before.

Our institution uses Turnitin, which scans students' papers for evidence of copying from anything on the internet and/or any other docs it can get its hands on. The best way I've found to use it is as a tool for the students, rather than as a tool I use to catch them cheating.

I tell the students that all scholars need to learn how to give other authors fair credit for their words and ideas, that Turnitin can help them, and that I will be using Turnitin to check what they've written. They then submit drafts of their papers to Turnitin, look at the report, then go back and rewrite or explicitly quote anything Turnitin flags. I see very little plagiarism in the final reports.

There are students who are just not attending for the "educational experience"; they are attending for the degree. The easiest way to get that degree seems right and fine to them. Essays are not a way to stretch their wings and thinking paradigms; essays are a way to ruin an evening. These are the folks for whom Turnitin was created.

Then there are students who aren't primary language speakers of the language used at the university, and see something online or in another document that says what they want to say far better than they think they could ever express it -- so they copy it without attribution. That tends to be an easy catch of plagiarism for the professor reading the essays. These students have a much harder time understanding the whole plagiarism concept. They found what they wanted to say written for them in good [primary language]. Why shouldn't they use it?

This is an interesting study and topic. As a teacher and former student who remembers what it was like to be a student, I think it would be fascinating to know why plagiarism happens.

Sometimes I feel instructors are all too quick to lump students together. Unfortunately, students are exposed to many different instructors. This can be a very good thing to some degree, and it can also be difficult for students. Different teachers have different teaching and grading styles. Though the syllabus and rubric are often used as a means to lay out what is expected of students, there is a great degree of subjectivity and bias involved in grading. Sometimes instructors are more interested (for various reasons) in doling out grades than in ensuring and encouraging student learning. Does this justify plagiarism? Of course not. But I think creating an atmosphere in the classroom (even in a large lecture hall) built on trust, or rather having students see that the teacher is a human being, as they are too, is a step in the right direction toward discouraging cheating of any sort.

My own guess is that students and teachers have very different outlooks on what they hope to achieve and this affects plagiarism.

Teachers must have a baseline knowledge of students' extemporaneous writing (from classroom tests) to compare to work done outside of class. A student who does C work in class, and A work outside, should be scrutinized. It is seldom practical to find source material which can be used (plagiarized) 100% verbatim. Plagiarists often do 'patchwork' that does not fit due to their poorer writing than that in the plagiarized material. Lazy people are seldom practiced and competent writers.

Even primary English speakers who are not motivated by a degree will plagiarize for a variety of reasons. I once became suspicious of a newspaper editorial by a regular contributor. It had 'patchwork'. The editor told me that she also had another letter from the contributor which was pending publication. It never appeared, and the contributor has not had anything else published.

A question about false positives: Of the papers that were flagged as plagiarized, how many were flagged because they quoted the source article with proper attribution? This is why a positive result from plagiarism detection software should not by itself be considered proof of plagiarism. It merely flags certain passages for closer inspection; the instructor then has to decide whether the passage actually does constitute plagiarism.

I speak as somebody who works in a field where there are a handful of instruments collecting data. Every paper that uses such data has to describe what the instrument they are using does, and there are only so many ways to do so. Such boilerplate text would also be an example of a false positive when flagged by plagiarism detection software.

Eric Lund @7, I think this bit of the article addresses your worry:

All essays were manually (visually) controlled by the investigators to ensure that properly quoted text was not counted as plagiarized text. However, no properly referenced direct quote was found in any of student essays. (142)

I'm assuming the humans involved in double-checking the rulings of the software were attentive to issues like boilerplate text.

Thanks for the discussion and pointer to this article!

I have been testing "plagiarism" detection software since 2004. The problem is that the software tends to only find copies, not plagiarisms. Simple manipulation of text is often enough to fool the system.

WCopyFind is very good at exactly this task: comparing a cohort of papers for collusion or against a known source.

I have tested Turnitin three times - it just does not deal with plagiarism in languages that have characters not in the ASCII set. I will be testing it this year with English-language test cases. Please note that Turnitin has had issues in the past with not finding the Wikipedia as a source (!) and there is the problem of copyright. Turnitin keeps a copy of the student's paper, which is in violation of EU copyright (author's right) law.

The current trend seems to be for students to purchase papers ("plagiarism free!") from ghostwriters. We must educate our students on why we have an issue with plagiarism and other scientific misconduct types. And we must do it the first semester, and repeat often.

I used to think that students who plagiarized were just lazy. Now I think that is only partially true. Mostly I think they don't know how to write. American high schools still graduate large numbers of students who cannot really put together a simple essay. I know this, because many of them end up at state universities. The universities in turn let the students slide by with a 'pass' in writing 121 or the equivalent. I suspect that we would see quite a bit less plagiarism if we could improve the writing skills of our students. Personally, I would like my university to implement a policy of stricter standards for passing out of 121 or the equivalent. Students who do not demonstrate real competence would simply remain in the course until they do. Make 121 a prereq for any course that involves a paper, and perhaps we would see a lot less cheating since the writing would be easier for them. I know it's only a theory, but at least it's testable...

Just the tip of the iceberg..
I know of fellow students using the internet through iPhones on exams. It shouldn't, but it does make me feel kind of stupid when merely scoring a B on the real test, while others are graded A+ for nothing more than their ability to quickly search the internet and find the right answers to the posed questions..
I always told myself they're only cheating themselves, but this surely isn't true; in fact they cheat everyone by making a total mockery of the educational system. Eventually degrading diplomas and science as a whole?

My impression is that even in the case where the researchers claim to be testing the role of punishment, the punishment is very light: rewrite the paper. I wonder what the outcome would have been if they were told that the punishment for plagiarism was expulsion from medical school?

"...the punishment for plagiarism" should be "expulsion from medical school."

When I was a grad student, we caught a bunch of undergrads cheating on a computer assignment (handing in others' work - often with the other students' names still in the comments section). The professor wanted to give the students a zero for the assignment, but I and the other TAs insisted that some stronger punishment was warranted. (We felt it should be better to not turn in an assignment than to cheat.) The final decision was that the students lost one entire letter grade from the class. (So if they earned a B, they got a C.)
