XBUS-506 Visual Analytics
Though visual representations of quantitative information were traditionally cast as the final phase of the data analysis pipeline, visualizations can play important roles throughout the analytic process and are critical to the work of the data scientist. Where static outputs and tabular data may render patterns opaque, human visual analysis can reveal structure that would otherwise go unnoticed, leading to more robust programming and better data products. For students getting started with data science, visual diagnostics are particularly important for effective machine learning. When all it takes is a few lines of Python to instantiate and fit a predictive model, visual analysis can help navigate the feature selection process, build intuition around model selection, identify common pitfalls like local minima and overfitting, and support hyperparameter tuning to produce more successful predictive models. In this course, students will learn to deploy a suite of visual tools using Scikit-Learn, Matplotlib, Pandas, Bokeh, and Seaborn to augment the analytic process and support machine learning from preliminary feature analysis through model selection, evaluation, and tuning.
Upon successful completion of the course, students will be able to use visualizations to:
- Summarize and analyze a range of data sets.
- Support feature engineering and feature selection.
- Diagnose common machine learning problems like bias, heteroscedasticity, underfitting, and overtraining.
- Evaluate their machine learning models' performance, stability, and predictive value.
- Steer their predictive models toward more successful results.
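As an illustration of the kind of visual diagnostic these objectives describe, here is a minimal sketch of a validation curve built with Scikit-Learn and Matplotlib. It is not taken from the course materials: the synthetic dataset, the choice of a decision tree regressor, the `max_depth` range, and the helper names `depth_validation_scores` and `plot_validation_curve` are all assumptions made for the example. A widening gap between the training and cross-validation scores as depth grows is one common visual sign of overfitting, while low scores on both curves suggest underfitting.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeRegressor


def depth_validation_scores(depths, random_state=0):
    """Return mean training and cross-validation R^2 for each tree depth.

    The dataset is synthetic (an assumption for this sketch), generated
    with noise so that very deep trees can memorize the training folds.
    """
    X, y = make_regression(
        n_samples=300, n_features=10, noise=25.0, random_state=random_state
    )
    train_scores, valid_scores = validation_curve(
        DecisionTreeRegressor(random_state=random_state),
        X,
        y,
        param_name="max_depth",
        param_range=depths,
        cv=5,
        scoring="r2",
    )
    # Each score array has shape (n_depths, n_folds); average over folds.
    return train_scores.mean(axis=1), valid_scores.mean(axis=1)


def plot_validation_curve(depths, train_mean, valid_mean):
    """Plot both curves on one axis so the gap between them is visible."""
    fig, ax = plt.subplots()
    ax.plot(depths, train_mean, marker="o", label="Training score")
    ax.plot(depths, valid_mean, marker="o", label="Cross-validation score")
    ax.set_xlabel("max_depth")
    ax.set_ylabel("R^2")
    ax.set_title("Validation curve for a decision tree regressor")
    ax.legend()
    return fig


depths = np.arange(1, 13)
train_mean, valid_mean = depth_validation_scores(depths)
fig = plot_validation_curve(depths, train_mean, valid_mean)
```

Calling `plt.show()` or `fig.savefig("validation_curve.png")` afterward displays or saves the figure. The same pattern applies to other hyperparameters by changing `param_name` and `param_range`.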