Practical Data Science and Machine Learning for Engineers
With the growing importance of data and data processing across all industries, it is critical for modern engineers to be nimble data scientists. For engineers who are not professional software developers, it can be tricky to break into the ecosystem of modern tooling required to process and learn from data efficiently. This course introduces the tools, theory, and methods of applied data science and machine learning (DS/ML). You will learn how to use open source DS/ML tools, the theory behind canonical ML algorithms, and practical methods and workflows for learning from data. The class is project-based, and you will be guided through a series of practical data science problems. In addition, you will learn the DevOps skills required to be productive as a DS/ML engineer.
What You Will Learn
- Modern developer tools
- Data science workflow
- Data science with Python in a browser
- Manipulating data in Python
- Telling stories with data visualization
- Machine learning theory
- Machine learning workflow and pipelines
- How to deal with bigger data
How You Will Benefit
- Implement fully functioning machine learning pipelines.
- Apply a modern DS/ML toolchain to practical problems.
- Become a proficient user of Python and Jupyter.
- Implement data visualization for effective data storytelling.
- Apply big data tools, such as Hadoop, Spark, and Amazon Web Services, to data problems.
DEVOPS FOR DATA SCIENCE
- Efficiently integrate with Google, Amazon, and Microsoft Azure cloud products to create remote and repeatable computing environments
- Linux command line fundamentals for installing and maintaining production enterprise DS/ML pipelines
DATA SCIENCE BASICS IN PYTHON
- How to install Jupyter and how to use it on remote machines
- Deep dive into data manipulation with pandas and NumPy
- Data visualization theory and practice using Matplotlib and Seaborn
MACHINE LEARNING THEORY
- Introduction and review of ML theory and algorithms
- Supervised learning, unsupervised learning, ensemble methods, boosting, deep neural networks
MACHINE LEARNING PRACTICE
- Classical ML techniques and conventions using scikit-learn
- Deep neural network learning using TensorFlow and Keras
SCALING DATA PIPELINES
- Processing big data with Hadoop and PySpark
- Creating repeatable production batch job pipelines using Luigi
Who Should Attend
This course is designed for engineers, scientists, and managers from commercial industry, educational institutions, and government agencies. It covers a core toolset that is relevant to almost any modern industry, from science to advertising to industrial automation.