Ferret: Reviewing Tabular Datasets for Manipulation
Date
2023
Publisher
The Eurographics Association and John Wiley & Sons Ltd.
Abstract
How do we ensure the veracity of science? The act of manipulating or fabricating scientific data has led to many high-profile fraud cases and retractions. Detecting manipulated data, however, is a challenging and time-consuming endeavor. Automated detection methods are limited due to the diversity of data types and manipulation techniques. Furthermore, patterns automatically flagged as suspicious can have reasonable explanations. Instead, we propose a nuanced approach where experts analyze tabular datasets, e.g., as part of the peer-review process, using a guided, interactive visualization approach. In this paper, we present an analysis of how manipulated datasets are created and the artifacts these techniques generate. Based on these findings, we propose a suite of visualization methods to surface potential irregularities. We have implemented these methods in Ferret, a visualization tool for data forensics work. Ferret makes potential data issues salient and provides guidance on spotting signs of tampering and differentiating them from truthful data.
Description
CCS Concepts: Human-centered computing -> Information visualization; Human computer interaction (HCI)
@article{10.1111:cgf.14822,
journal = {Computer Graphics Forum},
title = {{Ferret: Reviewing Tabular Datasets for Manipulation}},
author = {Lange, Devin and Sahai, Shaurya and Phillips, Jeff M. and Lex, Alexander},
year = {2023},
publisher = {The Eurographics Association and John Wiley \& Sons Ltd.},
ISSN = {1467-8659},
DOI = {10.1111/cgf.14822}
}