Data+Shift: Supporting Visual Investigation of Data Distribution Shifts by Data Scientists

dc.contributor.author: Palmeiro, João [en_US]
dc.contributor.author: Malveiro, Beatriz [en_US]
dc.contributor.author: Costa, Rita [en_US]
dc.contributor.author: Polido, David [en_US]
dc.contributor.author: Moreira, Ricardo [en_US]
dc.contributor.author: Bizarro, Pedro [en_US]
dc.contributor.editor: Agus, Marco [en_US]
dc.contributor.editor: Aigner, Wolfgang [en_US]
dc.contributor.editor: Hoellt, Thomas [en_US]
dc.date.accessioned: 2022-06-02T15:50:46Z
dc.date.available: 2022-06-02T15:50:46Z
dc.date.issued: 2022
dc.description.abstract: Machine learning on data streams is increasingly present in multiple domains. However, data distribution shift often occurs and can lead machine learning models to make incorrect decisions. While there are automatic methods to detect when drift is happening, human analysis, often by data scientists, is essential to diagnose the causes of the problem and adjust the system. We propose Data+Shift, a visual analytics tool to support data scientists in the task of investigating the underlying factors of shift in data features in the context of fraud detection. Design requirements were derived from interviews with data scientists. Data+Shift is integrated with JupyterLab and can be used alongside other data science tools. We validated our approach with a think-aloud experiment where a data scientist used the tool for a fraud detection use case. [en_US]
dc.description.sectionheaders: Visual Analysis and Machine Learning
dc.description.seriesinformation: EuroVis 2022 - Short Papers
dc.identifier.doi: 10.2312/evs.20221097
dc.identifier.isbn: 978-3-03868-184-7
dc.identifier.pages: 79-83
dc.identifier.pages: 5 pages
dc.identifier.uri: https://doi.org/10.2312/evs.20221097
dc.identifier.uri: https://diglib.eg.org:443/handle/10.2312/evs20221097
dc.publisher: The Eurographics Association [en_US]
dc.rights: Attribution 4.0 International License
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: CCS Concepts: Human-centered computing --> Visualization systems and tools; Computing methodologies --> Machine learning
dc.subject: Human centered computing
dc.subject: Visualization systems and tools
dc.subject: Computing methodologies
dc.subject: Machine learning
dc.title: Data+Shift: Supporting Visual Investigation of Data Distribution Shifts by Data Scientists [en_US]