Ferret: Reviewing Tabular Datasets for Manipulation

Date
2023
Publisher
The Eurographics Association and John Wiley & Sons Ltd.
Abstract
How do we ensure the veracity of science? The act of manipulating or fabricating scientific data has led to many high-profile fraud cases and retractions. Detecting manipulated data, however, is a challenging and time-consuming endeavor. Automated detection methods are limited due to the diversity of data types and manipulation techniques. Furthermore, patterns automatically flagged as suspicious can have reasonable explanations. Instead, we propose a nuanced approach where experts analyze tabular datasets, e.g., as part of the peer-review process, using a guided, interactive visualization approach. In this paper, we present an analysis of how manipulated datasets are created and the artifacts these techniques generate. Based on these findings, we propose a suite of visualization methods to surface potential irregularities. We have implemented these methods in Ferret, a visualization tool for data forensics work. Ferret makes potential data issues salient and provides guidance on spotting signs of tampering and differentiating them from truthful data.
Description

CCS Concepts: Human-centered computing -> Information visualization; Human computer interaction (HCI)

        
@article{10.1111:cgf.14822,
  journal = {Computer Graphics Forum},
  title = {{Ferret: Reviewing Tabular Datasets for Manipulation}},
  author = {Lange, Devin and Sahai, Shaurya and Phillips, Jeff M. and Lex, Alexander},
  year = {2023},
  publisher = {The Eurographics Association and John Wiley & Sons Ltd.},
  ISSN = {1467-8659},
  DOI = {10.1111/cgf.14822}
}