A researcher has developed software that detects copy-paste errors in scientific datasets hosted on open-access repositories such as Dryad. Initial scans of 600 datasets found 18 serious cases of data duplication, including a widely cited Parkinson's disease paper with more than 3,000 citations in which the same measurements appear for different mice. The findings point to systemic data integrity problems in peer-reviewed research.
Research
Scientific datasets are riddled with copy-paste errors
A researcher's automated detection tool uncovered 18 copy-paste errors across 600 open-access datasets, including duplicated measurements in a Parkinson's study with 3,000+ citations, exposing systemic data integrity failures in peer-reviewed research.
Monday, April 20, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: Hacker News | BY sys://pipeline
Tags
research