Georgia Tech Professional Education

Practical Data Science and Machine Learning for Engineers

Available dates

This course has no confirmed dates in the future.

About the course

With the growing importance of data and data processing across all industries, it is critical for modern engineers to be nimble data scientists. For engineers who are not professional software developers, it can be tricky to break into the ecosystem of modern tooling required to process and learn from data efficiently. This course introduces the tools, theory, and methods of applied data science and machine learning (DS/ML). You will learn how to use open-source DS/ML tools, the theory behind canonical ML algorithms, and practical methods and workflows for learning from data. The class is project-based: you will be guided through a series of practical data science problems. In addition, you will learn the DevOps skills required to be productive as a DS/ML engineer.

What You Will Learn

  • Modern developer tools
  • Data science workflow
  • Data science with Python in a browser
  • Manipulating data in Python
  • Visualization stories with data
  • Machine learning theory
  • Machine learning workflow and pipelines
  • How to deal with bigger data

How You Will Benefit

  • Implement fully functioning machine learning pipelines.
  • Apply a modern DS/ML toolchain to practical problems.
  • Become a proficient user of Python and Jupyter.
  • Implement data visualization for effective data storytelling.
  • Apply big data tools, such as Hadoop, Spark, and Amazon Web Services, to data problems.



  • Efficiently integrate with Google, Amazon, and Microsoft Azure cloud products to create remote, repeatable computing environments
  • Linux command-line fundamentals for installing and maintaining production enterprise DS/ML pipelines


  • How to install Jupyter and how to use it on remote machines
  • Deep dive into data manipulation with pandas and NumPy
  • Data visualization theory and practice using Matplotlib and Seaborn


  • Introduction and review of ML theory and algorithms
  • Supervised learning, unsupervised learning, ensemble methods, boosting, and deep neural networks


  • Classical ML techniques and conventions using scikit-learn
  • Deep neural network learning using TensorFlow and Keras


  • Processing big data with Hadoop and PySpark
  • Creating repeatable production batch job pipelines using Luigi

Who should attend

This course is designed for engineers, scientists, and managers from commercial industry, educational institutions, and government agencies. It covers a core toolset relevant to almost any modern industry, from science to advertising to industrial automation.

Trust the experts

Brian Beck

Brian Beck received the B.S. degree in electrical engineering and the M.B.A. degree from Ohio State University in 2007 and 2009, respectively. He received the Ph.D. degree in electrical and computer engineering in 2016 from the Georgia Institute of Technology. Brian currently works as a research ...


Vincent Emanuele

Dr. Vince Emanuele has more than 12 years of experience researching and implementing machine learning systems in collaboration with interdisciplinary teams. Most recently, Vince was Head of Data Science at Wellcentive, a company focused on healthcare analytics. During his tenure at Wellcentive, ...

