Data wrangling is a difficult and time-consuming activity in computational notebooks, and existing wrangling tools do not fit the exploratory workflow of data scientists in these environments. We propose a unified interaction model based on programming-by-example that generates readable code for a variety of useful data transformations, implemented as a Jupyter notebook extension called Wrex. User study results demonstrate that data scientists are significantly more effective and efficient at data wrangling with Wrex than with manual programming. Qualitative participant feedback indicates that Wrex was useful and reduced the barrier of having to recall or look up the usage of various data transform functions. The synthesized code allowed data scientists to verify the intended data transformation, increased their trust and confidence in Wrex, and fit seamlessly within their cell-based notebook workflows. This work suggests that presenting readable code to professional data scientists is an indispensable component of offering data wrangling tools in notebooks.