Test cases for plagiarism detection software

Debora Weber-Wulff

Abstract

Numerous plagiarism detection software systems claim to discover plagiarism of all sorts, given a digital text. This paper first discusses a typology of plagiarism, which makes clear that plagiarism is more than just an exact copy. It then presents a collection of 42 German-language test cases that were developed at the HTW Berlin for testing plagiarism detection software. The test cases have been used in three tests, in 2004, 2007, and 2008, and are available online. The test suite will be extended to include English-language test cases in 2010.

This paper was submitted to the International Integrity & Plagiarism Conference, which ran from 2004 to 2014. The paper was peer reviewed by an independent editorial board and features in the conference proceedings.