A scientific approach designed to precisely calibrate the metrics needed for quantifying bullshit

Science News reports: Dutch social psychologist Diederik Stapel was known for his meteoric rise, until he was known for his fall. His research on social interactions, which spanned topics from infidelity to selfishness to discrimination, frequently appeared in top-tier journals. But then in 2011, three junior researchers raised concerns that Stapel was fabricating data. Stapel’s institution, Tilburg University, suspended him and launched a formal investigation. A commission ultimately determined that of his more than 125 research papers, at least 55 were based on fraudulent data. Stapel now has 57 retractions to his name.

The case provided an unusual opportunity for exploring the language of deception: one set of Stapel’s papers reported faked data, and another set was based on legitimate results. Linguists David Markowitz and Jeffrey Hancock ran an analysis of articles in each set that listed Stapel as the first author. The researchers discovered particular tells in the language that allowed them to peg the fraudulent work with roughly 70 percent accuracy. While Stapel was careful to concoct data that appeared reasonable, he oversold his false goods, using, for example, more science-related terms and more amplifying terms, like “extreme” and “exceptionally,” in the now-retracted papers.
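To make the idea of a linguistic "tell" concrete, here is a minimal sketch of one such measure: the rate of amplifying terms per 100 words. It is an illustration only, not the researchers' actual method. The function name `amplifier_rate`, the sample sentences, and the two-word list (taken from the examples quoted above) are assumptions; the dictionary used in the real analysis is not given here.

```python
import re

# Placeholder list built from the two examples quoted in the article.
# The dictionary used in the actual study is not given here (assumption).
AMPLIFIERS = {"extreme", "exceptionally"}


def amplifier_rate(text: str) -> float:
    """Return the number of amplifying terms per 100 words of text."""
    # Lowercase the text and split it into simple alphabetic tokens.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in AMPLIFIERS)
    return 100.0 * hits / len(words)


# Hypothetical sample sentences, invented for illustration.
sample_a = "The effect was extreme and the results were exceptionally clear."
sample_b = "The effect was small and the results were mixed."

print(amplifier_rate(sample_a))
print(amplifier_rate(sample_b))
```

A single rate like this would not identify fraud by itself. At most, a paper that scores unusually high on several such measures might be flagged for a closer look.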

Markowitz and Hancock, now at Stanford, are still probing the language of lies, and they recently ran a similar analysis on a larger sample of papers with fudged data.

The bottom line: Fraudulent papers were full of jargon, harder to read, and bloated with references. This parsing-of-language approach, which the team describes in the Journal of Language and Social Psychology, might be used to flag papers that deserve extra scrutiny. But tricks for detecting counterfeit data are unlikely to thwart the murkier problem of questionable research practices or the general lack of clarity in the scientific literature.

“This is an important contribution to the discussion of quality control in research,” Nick Steneck, a science historian at the University of Michigan and an expert in research integrity practices, told me. “But there’s a whole lot of other reasons why clarity and readability of scientific writing matters, including making things understandable to the public.”

War in Context

… with attention to the unseen
