Who should attend
- Data analysts and business analysts
- Database managers
- Technical and systems analysts
- Programmers interested in data science
- Business managers
About the course
Data science is one of today’s most in-demand functions — and Python is an essential skill in any data scientist's toolbox. In this program, you will master the ability to analyze and visualize data in meaningful ways using Python to help solve complex business problems. Working with tools such as Jupyter Notebooks, NumPy, and Pandas, you will have the opportunity to analyze real-world datasets to identify patterns and relationships in data. You will gain experience using both built-in and custom-built data types to create expressive and computationally robust data science projects. Finally, you will build predictive machine learning models using Python and scikit-learn.
To be successful in this program, it is recommended that students have some experience in analytics and programming, specifically with creating visualizations in spreadsheets.
The amount of time you spend on these courses will depend on your prior experience. Since these courses are designed for someone with limited exposure to programming, you can expect them to start off with the foundations and then quickly move into more advanced and complex topics.
Constructing Expressions in Python
Expressions are a core building block of any Python program. In this course, you will construct expressions and reuse them to manipulate and compute variables in a variety of applications. This reusability enables a "create once, use everywhere" development paradigm that will streamline development of your current and future Python data science projects. You will develop the knowledge and skills to assign and access variables, combine variables and data in expressions, and use Python as a powerful calculator. You'll also use the enhanced capabilities of the IPython environment to work interactively with Python and to explore your data through new analyses.
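As a rough illustration of what assigning variables and combining them in expressions can look like, here is a minimal sketch. The variable names and values are hypothetical and are not taken from the course materials.

```python
# Hypothetical values chosen only for illustration.
unit_price = 19.99
quantity = 3
tax_rate = 0.08

# Arithmetic expressions combine variables and literals into new values.
subtotal = unit_price * quantity
total = round(subtotal * (1 + tax_rate), 2)

# A comparison expression evaluates to a boolean.
is_large_order = total > 50

print(subtotal, total, is_large_order)
```

Each name is assigned once and then reused in later expressions, which is the kind of interactive calculation the IPython environment is well suited to.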
Writing Custom Python Functions, Classes, and Workflows
This course introduces you to the scenarios in which you will use built-in Python functions, classes, and data types, as opposed to creating your own or combining built-in and custom-built capabilities. You will gain experience working with both built-in and custom-built functions, classes, and data types. Through practice with these basic building blocks, you will gain an in-depth understanding of how these aspects of Python interoperate to create useful programs.
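The difference between built-in and custom-built capabilities might be sketched as follows. The function `mean` and the class `Measurement` are hypothetical names invented for this example, not part of the course.

```python
from dataclasses import dataclass


def mean(values):
    """Custom function: arithmetic mean of a non-empty sequence."""
    # sum() and len() are built-in functions; mean() combines them.
    return sum(values) / len(values)


@dataclass
class Measurement:
    """Custom class that bundles a label with a built-in list of readings."""
    label: str
    readings: list

    def average(self):
        # The custom method reuses the custom function defined above.
        return mean(self.readings)


m = Measurement("temperature", [20.5, 21.0, 22.5])
print(m.label, m.average())
```

Here the built-in `list`, `sum`, and `len` do the low-level work, while the custom function and class give the calculation a reusable, named form.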
Developing Data Science Applications
Python is much more than a programming language. In this course, you will leverage the comprehensive Python ecosystem of libraries, frameworks, and tools to develop complex data science applications. Throughout this course, you will practice using the different Python tools appropriate to your dataset. You will leverage library resources for data acquisition and analysis as well as machine learning. Dataframes will be introduced as a means of manipulating structured data tables for advanced analysis. Additionally, you will practice basic routines for data visualization utilizing Jupyter Notebooks.
Creating Data Arrays and Tables in Python
Decision-makers generally do not use raw data to make decisions; they prefer data summarized in easily understood formats that facilitate efficient decision-making. This course introduces data manipulation and visualization, both critical components of any data science project, along with two commonly used data manipulation tools in the Python ecosystem: NumPy and Pandas. The Python ecosystem also includes a variety of data plotting packages, such as Matplotlib, Seaborn, and Bokeh, each of which specializes in particular aspects of data visualization. This course will give you experience integrating NumPy, Pandas, and the plotting packages to create rich, interactive data visualizations that help drive efficient decision-making.
Organizing Data with Python
Most data science projects that use Python will require you to access and integrate different types of data from a variety of external sources. This course will give you experience identifying and integrating data from spreadsheets, text files, websites, and databases. To prepare for downstream analyses, you first need to integrate any external data sources into your Python program. You will utilize existing packages and develop your own code to read data from a variety of sources. You will also practice using Python to prepare disorganized, unstructured, or unwieldy datasets for analysis by other stakeholders.
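As a small sketch of reading and tidying a disorganized text source, the example below uses only Python's standard `csv` and `io` modules. The sample data and the `load_rows` helper are hypothetical; a real project might read from a file, website, or database instead.

```python
import csv
import io

# Hypothetical messy input: stray spaces, inconsistent case, a missing value.
raw_text = """name, region ,sales
Alice,North,1200
bob,  south,
Carla,East,950
"""


def load_rows(text):
    """Parse CSV text and return a list of cleaned row dictionaries."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        # Strip whitespace from both column names and values.
        cleaned = {key.strip(): value.strip() for key, value in record.items()}
        cleaned["name"] = cleaned["name"].title()
        cleaned["region"] = cleaned["region"].title()
        # Represent a missing sales figure as None rather than an empty string.
        cleaned["sales"] = int(cleaned["sales"]) if cleaned["sales"] else None
        rows.append(cleaned)
    return rows


rows = load_rows(raw_text)
for row in rows:
    print(row)
```

The same cleaning steps (trimming, normalizing case, handling missing values) apply whichever package is used to load the data.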
Analyzing and Visualizing Data with Python
In order to be useful within a professional environment, data must be structured in a way that can be understood and applied to real-world scenarios. This course introduces using Python to perform statistical data analysis and create visualizations that uncover patterns in your data. Using the tools and workflows you developed in earlier courses, you will carry out analyses on real-world datasets to become familiar with recognizing and utilizing patterns. Finally, you will form and test hypotheses about your data which will become the foundation upon which data-driven decision-making is built.
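One way to form and test a hypothesis about data is a permutation test, sketched below with the standard library only. This is an illustrative choice of technique, not necessarily the one taught in the course, and the two groups of numbers are made up for the example.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# Hypothetical measurements for two groups.
group_a = [12.1, 11.8, 12.6, 12.9, 12.4]
group_b = [10.2, 10.9, 10.5, 11.1, 10.4]

observed, p_value = permutation_test(group_a, group_b)
print(observed, p_value)
```

A small p-value suggests the observed difference would be unusual if the two groups really came from the same distribution.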
Building Predictive Machine Learning Models
In this course, you will explore some of the machine learning tools you can use to magnify the analytical power of Python data science programs. You will use the scikit-learn package — a Python package developed for machine learning applications — to develop predictive machine learning models. You will then practice using these models to discover new relationships and patterns in your data. These capabilities allow you to unlock additional value in your data that will aid in making predictions and, in some cases, creating new data.
KEY COURSE TAKEAWAYS
- Visualize data with Python
- Write custom functions and data classes in Python that can be stored for reuse
- Use key elements of Python control flow and iteration
- Use Jupyter Notebooks to integrate data analysis, visualization, and documentation
- Manipulate data arrays and tables using NumPy and Pandas
- Filter, integrate, and prepare data for analysis
- Perform statistical data analysis and visualization
- Explore datasets with machine learning
Professor Myers has been with CAC since 2017, having previously been a member of the research staff of the Bioinformatics Facility of the Institute of Biotechnology (2007-2017) and the Cornell Theory Center (1993-1997, 1998-2007). In addition, Professor Myers is an Adjunct Professor in the Depart...