Never realized you could detect fraudulent data using Cronbach’s Alpha.
Dutch newspaper “De Volkskrant” published an interview this weekend on the large-scale fraud by a Dutch professor in social psychology. Three of his PhD candidates discovered the fraud, using Cronbach’s Alpha to show that the reported scores on some scale were unrealistically unreliable, while still showing clear patterns in support of the hypothesis.
The exact argument is not provided in the newspaper, but I assume it runs something like this: if the measurements came from a natural (= real, non-fraudulent) experimental setting, it is very unlikely that such an unreliable scale would produce such clear patterns in the mean structure of the data. Sure, that is no proof of fraud, but it is where the seed of doubt was planted. The publicly available version of the article is found here (in Dutch), and newspaper subscribers have access to the much longer interview with the three PhD candidates who uncovered the fraud.
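For readers who have not computed it by hand: Cronbach's Alpha compares the summed variance of the individual items with the variance of the total score. Below is a minimal sketch in plain Python. The function name `cronbach_alpha` and the two tiny score tables are made up for illustration; they are not the data from the fraud case, which the newspaper does not provide.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: one list of k item scores per respondent (hypothetical input layout).
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*rows))  # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)  # sample variances
    total_var = variance([sum(r) for r in rows])  # variance of the sum score
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Illustrative data only: items that move together vs. items that do not.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
inconsistent = [[1, 4, 2], [2, 1, 4], [3, 3, 1], [4, 2, 3]]

print(cronbach_alpha(consistent))
print(cronbach_alpha(inconsistent))
```

The first table gives an alpha of 1 (the items agree perfectly), the second a strongly negative one. A scale as unreliable as the second table is mostly noise, which is why clean hypothesis-confirming means on top of it would raise eyebrows.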
Ironic that a measure developed to calculate reliability was used to detect fraud.
This reminded me of a series of blog posts Andrew Gelman ran a few years ago on using statistics to detect election fraud (e.g. here, here, and more recently here).
Update: The newspaper of my university (UT News) also runs a story on this issue.
Actually, I suspect they ran a statistical analysis on the actual digits that showed up. At least, that’s what forensic accountants and suchlike normally do. That’s how they proved that Greece was cooking the books, and that Belgium was the most likely other Eurozone country to be doing the same.
http://economicsintelligence.com/2011/07/28/how-an-arcane-statistical-law-could-have-prevented-the-greek-disaster/
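The "arcane statistical law" in the linked title is presumably Benford's law, which says that in many naturally occurring data sets the leading digit d appears with probability log10(1 + 1/d). A rough sketch of such a digit check follows. The helper names and the chi-square comparison are my own assumptions for illustration, not necessarily the method used in the Greek case.

```python
import math
from collections import Counter


def first_digit(x):
    """Leading significant digit of a non-zero number."""
    return int(f"{abs(x):.15e}"[0])  # scientific notation starts with that digit


def benford_chi2(values):
    """Chi-square statistic of observed first digits against Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # Benford's expected count
        chi2 += (counts.get(d, 0) - expected) ** 2 / expected
    return chi2


# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CRITICAL_5PCT = 15.507

powers_of_two = [2 ** n for n in range(200)]  # known to follow Benford closely
flat_digits = list(range(100, 1000))          # every first digit equally often

print(benford_chi2(powers_of_two) < CRITICAL_5PCT)
print(benford_chi2(flat_digits) > CRITICAL_5PCT)
```

A statistic above the critical value only says the digits look unlike Benford's law. Plenty of honest data sets do not follow the law either, so this is a reason to look closer, not proof of cooking the books.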
Hi Martin,
the use of Cronbach’s Alpha was discussed specifically in this interview, but indeed: the techniques you mention are commonly used. Specific patterns emerge in fake data, while these patterns are highly unlikely in ‘real’ data.
Even more: the study on ‘meat-eating bastards’ (the data of which were ‘obtained’ by the same fraudulent social psychologist) was uncovered to be fake with a very similar technique. Much simpler, though: a statistician noticed in the presented table that the marginal distribution of observations was logically impossible given the small number of observations.