Advanced Machine Learning for Big Data and Text Processing
Machine learning methods drive much of modern data analysis across engineering, science, and commercial applications. For example, search engines, recommender systems, advertisers, and financial institutions employ machine learning algorithms for content recommendation, customer behavior prediction, compliance, and risk.
This course looks at how the latest tools, techniques, and algorithms driving modern predictive analysis can be applied in different fields, even when the data is unstructured. You'll gain insight into the underlying tools, what kinds of problems they can and cannot solve, how they can be applied effectively, and what issues are likely to arise in practical applications, particularly in the healthcare field.
- Understand broad opportunities for automation with machine learning
- Outline key aspects of practical problems that are likely to impact performance
- Explore modern natural language processing tools, formulations, and problems
- Be able to discuss scaling issues (amount of data, dimensionality, storage, and computation)
- Follow the process of applying machine learning methods in practice, and foresee likely hurdles and possible remedies
- Grasp what predictive analytics often does not provide
- Understand current machine learning trends and the opportunities they bring
Day 1: (6h)
- Recommender systems (2h)
- Unsupervised learning: mixtures, EM (2h)
- Markov models, recurrent neural networks (2h)
Day 2: (6h)
- Reinforcement learning (2h)
- Deep RL (1h)
- Intro to NLP problems (1h)
- Learning lexical representations (1h)
- Extraction, annotation, parsing (1h)
Day 3: (6h)
- Advanced NLP applications: machine translation, dialogue systems (2h)
- ML for medical applications and drug design (2h)
- Participant problems, solicited in advance (2h)
Who should attend
This course is designed for people with working knowledge of, and experience with, machine learning. Those who attend should have a basic understanding of the essential mathematical concepts and theories used in the field. The course assumes an undergraduate degree in computer science or another technical area such as statistics, physics, or electrical engineering, with exposure to vectors, matrices, and basic concepts of probability. A high-level understanding of programming (thinking in terms of programs) is also beneficial.
- For professionals who work hands-on with data, the course aims to provide a deeper understanding and sharper intuitions about what is possible, what is not, and which methods to consider in which contexts.
- At the managerial level, the course provides a vision and understanding of the many opportunities, costs, and likely performance hurdles in predictive modeling, especially as they pertain to large amounts of textual (or similar) data.