Scientific papers, along with the citations accompanying them, were originally intended to disseminate knowledge. Increasingly, however, they are treated as (ac)counting units for evaluation. Information systems that manage scientific publications can compute a wide range of metrics and rankings, which now shape the core logic of publishing activity. In this talk, we will see that this has led to new kinds of bizarre artifacts that can be automatically detected: meaningless publications, tortured phrases, irrelevant or sneaked references, and more.
Uncovering and analyzing these practices is part of the ERC Synergy project NanoBubbles, dedicated to understanding "how, when, and why science fails to self-correct." Within this project, several research activities focus on the automatic analysis of scientific articles: detecting inappropriate citations; examining the impact of retracted papers on claims and reasoning in citing articles; detecting new tortured phrases; and more.

