Data and Models: Regression Analytics
This course teaches a suite of algorithms and concepts to a diverse set of participants interested in fitting data to models. It starts with mostly simple linear algebra and computational methods and introduces more difficult mathematical concepts towards the end. This progression also, by design, fits our approach of morning lectures and afternoon practice on personal computers. The combined teaching system provides ample opportunity for hands-on learning, and participants leave the course with practical knowledge of the basic algorithms.
The course is very broad and is primarily intended to cover the fundamentals of each technique we address. The major gain from this breadth is that we can cover many different approaches. Think of it this way: for each method, we cover the first chapter or two of a specialized "book" on that method. That gets you through the fundamentals, which then allow you to dig further through the book on your own. Another way of thinking of our approach is the analogy of a carpenter’s tools: the goal is for participants to understand the utility of each tool, not to become specialists in any one method. In that sense the course is introductory and general.
The course draws on material from a very wide selection of literature in many disciplines involving computation, including but not limited to: statistics and applied mathematics, science, engineering, medicine and biomedicine, computer science, geosciences, system engineering, economics, insurance, finance, business, and aerospace engineering. More specific areas in which you might come across relevant books are regression, non-linear regression, linear and non-linear parameter estimation, inversion, system identification, econometrics, and biometrics. The diversity of past participants and their fields has always provided many perspectives on our common interest in data and models. Please note that we do not specifically cover non-parametric statistics, principal component analysis, or Big Data.
You will be able to take the afternoon lab exercises with you as executables so you can practice the course material at a later time. These programs are not intended as a stand-alone package to be used later in regression applications; they are given to participants simply to aid in the course instruction.
Laptops for which you have administrative privileges are required for this course. PCs are recommended. Tablets will not be sufficient for the computing activities performed in this course.
Takeaways from this course include:
- Examining how to fit data to models
- Defining linear least squares, non-linear least squares, singular value decomposition, sensitivity analysis, experiment design, and parameter error estimation
- Appreciating grid search, random search, simulated annealing, genetic algorithms, neural networks, and large inverse systems
- Investigating principles leading to rapid application of methods
- Evaluating the results of pre-programmed computer exercises
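To give a flavor of the first few items above, here is a minimal sketch of a straight-line least-squares fit with parameter error estimates. It is not taken from the course software: the function name `fit_line`, the synthetic data, and the noise level are all assumptions made for illustration, and NumPy is used only because it is widely available.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y ~ a + b*x (hypothetical helper, for illustration).

    Returns the parameters [a, b] and their estimated standard errors,
    assuming independent data errors of equal (unknown) variance.
    """
    # Design matrix: a column of ones for the intercept, then x for the slope.
    G = np.column_stack([np.ones_like(x), x])
    # Solve the linear least-squares problem min ||G m - y||^2.
    m, *_ = np.linalg.lstsq(G, y, rcond=None)
    residuals = y - G @ m
    dof = len(y) - G.shape[1]                  # degrees of freedom
    sigma2 = residuals @ residuals / dof       # estimated data variance
    cov = sigma2 * np.linalg.inv(G.T @ G)      # parameter covariance matrix
    return m, np.sqrt(np.diag(cov))

# Synthetic data (assumed for this example): true line y = 2 + 0.5 x plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

params, errors = fit_line(x, y)
print("intercept = %.3f +/- %.3f" % (params[0], errors[0]))
print("slope     = %.3f +/- %.3f" % (params[1], errors[1]))
```

The error estimates here rest on the stated assumption of independent, equal-variance data errors; other error models would call for a weighted fit.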
The format of each day is generally the same: mornings are devoted to lectures, and participants spend the afternoons running pre-programmed software based on the morning lectures. During the afternoons, we stop the class often to discuss progress and to offer tips and suggestions. Participants can work singly or in pairs at the computer.
Individual lectures will address the following topics:
- Philosophy of Data and Models
- Straight Line Data Analysis
- Least Squares
- Levenberg-Marquardt and Ridge Regression Algorithms
- Damped Least Squares Comparison
- Stochastic Inverse
- Singular Value Decomposition
- Random and Grid-Search Methods
- Simulated Annealing and Genetic Algorithms
- Neural Networks
- Parameter Error Estimates
- Large Inverse Problems
- Experimental Design
Note that the order of the lectures can vary from that given above. A bound copy (and an electronic version) of all PowerPoint lecture notes is given to each participant, to follow lectures and make notes.
Who should attend
This course is ideal for anyone who fits data to models. It is truly broad-based, and participants from vastly differing fields are encouraged to attend. Some of these fields are engineering, business, natural sciences, geoscience, medicine, statistics, and economics.
Familiarity with computing and statistics is desirable. A fair background in linear algebra is highly recommended. The course is a condensed version of a regular MIT class with the same title, taught by Professor Morgan. The course has also been given at NASA, the University of the West Indies in Barbados, Sakarya University in Turkey, Stanford University, University of Science and Technology of China, the Cyprus Institute, and Texas A&M University.
Recent and past participants in this course have come from: Air Force Office of Scientific Research (AFOSR), Amgen Inc., AT&T, BAE Systems, Bank of America, Boeing, Boehringer Ingelheim Pharmaceuticals, BP America, Cox Communications, Delphi, DuPont, Environmental Protection Agency, ExxonMobil Chemical, General Motors, Hitachi (Japan), Intel, Johnson & Johnson, Korea Power Co., Kraft Foods, Los Alamos Labs, MathWorks, Mayo Clinic, Merck & Co Inc, Motorola, Naval Research Laboratory, NTT (Japan), Nokia Research Center, Phillips Exeter Academy, Pioneer Investments, Polaroid Corporation, Sandia National Labs, Saudi Arabian Monetary Agency, University of Pennsylvania, University of the West Indies, and the U.S. Air Force.