About the course
Though visual representations of quantitative information were traditionally cast as the end phase of the data analysis pipeline, visualizations can play important roles throughout the analytic process and are critical to the work of the data scientist. Where static outputs and tabular data may obscure patterns, visual analysis can reveal a great deal and lead to more robust programming and better data products. For students getting started with data science, visual diagnostics are particularly important for effective machine learning. When all it takes is a few lines of Python to instantiate and fit a predictive model, visual analysis can help navigate the feature selection process, build intuition around model selection, identify common pitfalls like local minima and overfitting, and support hyperparameter tuning to produce more successful predictive models.
In this course, students will learn to deploy a suite of visual tools using Scikit-Learn, Matplotlib, Pandas, Bokeh, and Seaborn to augment the analytic process and support machine learning from preliminary feature analysis through model selection, evaluation, and tuning.
Upon successful completion of the course, students will be able to use visualizations to:
- Summarize and analyze a range of data sets.
- Support feature engineering and feature selection.
- Diagnose common machine learning problems like bias, heteroscedasticity, underfitting, and overtraining.
- Evaluate their machine learning models' performance, stability, and predictive value.
- Steer their predictive models toward more successful results.
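As a rough illustration of the kind of visual diagnostic described above, the sketch below plots a validation curve with Scikit-Learn and Matplotlib. It is not taken from the course materials: the synthetic data set, the decision tree model, the max_depth range, and the output file name are all assumptions chosen for the example. A widening gap between the training and cross-validation scores as depth grows is one common visual sign of overfitting.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a real data set (an assumption for illustration).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# Score the model at each tree depth using 5-fold cross-validation.
depths = np.arange(1, 16)
train_scores, valid_scores = validation_curve(
    DecisionTreeClassifier(random_state=0),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    cv=5,
)

# Average across folds: one training and one validation score per depth.
train_mean = train_scores.mean(axis=1)
valid_mean = valid_scores.mean(axis=1)

# Plot both curves on the same axes so the gap between them is visible.
fig, ax = plt.subplots()
ax.plot(depths, train_mean, marker="o", label="training score")
ax.plot(depths, valid_mean, marker="o", label="cross-validation score")
ax.set_xlabel("max_depth")
ax.set_ylabel("accuracy")
ax.set_title("Validation curve for a decision tree")
ax.legend()
fig.savefig("validation_curve.png")  # hypothetical output file name
```

Reading the plot left to right, a depth where the cross-validation score levels off while the training score keeps climbing is a reasonable starting point for tuning.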
Rebecca Bilbro is an adjunct faculty member in Georgetown University's Data Science Certificate Program, where she teaches Visual Analytics. Dr. Bilbro earned her doctorate from the University of Illinois, Urbana-Champaign, where her research centered on communication and visualization practices ...
Benjamin Bengfort is an experienced data scientist and software engineer who focuses on implementing data products that can learn from real-time streaming data. Benjamin is the program director of the Georgetown Data Science Certificate program where he also teaches Machine Learning. He is also ...