Dylan Small

Class of 1965 Wharton Professor of Statistics at The Wharton School

Schools

  • The Wharton School

Biography

The Wharton School

Education

PhD, Stanford University, 2002
BA, Harvard University, 1997

Academic Positions Held

Wharton: 2002-present

For more information, go to My Personal Page

Sameer Deshpande, Raiden Hasegawa, Amanda Rabinowitz, John Whyte, Carol Roan, Andrew Tabatabaei, Michael Baiocchi, Jason Karlawish, Christina Master, Dylan Small (2017), Association of Playing High School Football with Cognition and Mental Health Later in Life, JAMA Neurology.

Dylan Small and Paul R. Rosenbaum (2016), Constructed second control groups and attenuation of unmeasured biases, Journal of the American Statistical Association, 111, pp. 1157-1167. doi: 10.1080/01621459.2015.1076342

Zijian Guo, Hyunseung Kang, Tony Cai, Dylan Small (Draft), Testing endogeneity with possibly invalid instruments and high dimensional covariates.

Edward Kennedy, Zongming Ma, Matthew McHugh, Dylan Small (2016), Nonparametric methods for doubly robust estimation of continuous treatment effects, Journal of the Royal Statistical Society: Series B (Statistical Methodology).

Dylan Small and Paul R. Rosenbaum (2016), An exact test of fit for the Gaussian linear model using optimal nonbipartite matching, Technometrics, in press.

Hyunseung Kang, Anru Zhang, Tony Cai, Dylan Small (2016), Instrumental Variables Estimation With Some Invalid Instruments and its Application to Mendelian Randomization, Journal of the American Statistical Association, 111, pp. 132-144.

Zijian Guo, Dylan Small, Stuart Gansky, Jing Cheng (Under Revision), Mediation analysis for count and zero-inflated count data without sequential ignorability.

Zijian Guo, Hyunseung Kang, Tony Cai, Dylan Small (Under Review), Confidence interval for causal effects with possibly invalid instruments even after controlling for many confounders.

Abstract: The instrumental variable (IV) method is commonly used to estimate the causal effect of a treatment on an outcome by using IVs that satisfy the assumptions of association with treatment, no direct effect on the outcome and ignorability. A major challenge in IV analysis is to find such IVs, and typically one is unsure whether all of the putative IVs are in fact valid (i.e. satisfy the assumptions). We propose a general inference procedure that provides honest inference in the presence of invalid IVs, even after controlling for a large number of covariates. The key step of our method is a novel selection procedure, which we call Two-Stage Hard Thresholding (TSHT), where we use hard thresholding to select the set of non-redundant instruments in the first stage and subsequently use hard thresholding to select the valid instruments in the second stage using the thresholding from the first stage. TSHT not only allows us to select invalid IVs, but also provides honest confidence intervals for the treatment effect at the $\sqrt{n}$ rate. We establish asymptotic properties of our procedure and demonstrate that our procedure performs well in simulation studies compared to traditional IV methods, especially when the instruments are invalid.
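
As a rough illustration of why instrument validity matters in the abstract above, the following Python sketch simulates a single instrument and compares a naive regression slope with the simple IV ratio estimate. It is not the TSHT procedure and is not taken from the paper. The data-generating model, the coefficient values, and every function name here (simulate, ols_slope, iv_ratio) are assumptions made only for this example.

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def simulate(n, beta, direct_effect=0.0, seed=0):
    """Simulate instrument z, treatment d and outcome y (hypothetical model).

    u is an unmeasured confounder of d and y. When direct_effect is nonzero,
    z affects y directly, so z violates the "no direct effect" assumption.
    """
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)            # instrument
        ui = rng.gauss(0, 1)            # unmeasured confounder
        di = zi + ui + rng.gauss(0, 1)  # treatment depends on z and u
        yi = beta * di + 2.0 * ui + direct_effect * zi + rng.gauss(0, 1)
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y


def ols_slope(d, y):
    """Naive regression slope of y on d; biased here because u is omitted."""
    return cov(d, y) / cov(d, d)


def iv_ratio(z, d, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, d)."""
    return cov(z, y) / cov(z, d)


# Valid instrument: the IV ratio should land near the true effect beta = 1.
z, d, y = simulate(20000, beta=1.0)
ols_valid = ols_slope(d, y)
iv_valid = iv_ratio(z, d, y)

# Invalid instrument (direct effect on y): the IV ratio is pulled away from 1.
z2, d2, y2 = simulate(20000, beta=1.0, direct_effect=0.5)
iv_invalid = iv_ratio(z2, d2, y2)

print("OLS slope:", round(ols_valid, 2))
print("IV estimate, valid instrument:", round(iv_valid, 2))
print("IV estimate, invalid instrument:", round(iv_invalid, 2))
```

Under this assumed model the population values are about 1.67 for the naive slope, 1.0 for the IV ratio with a valid instrument, and 1.5 when the instrument has a direct effect of 0.5 on the outcome. The last case is the kind of bias that procedures for handling invalid instruments are meant to address.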

Zijian Guo and Dylan Small (2016), Control function instrumental variable estimation of nonlinear causal effect models, Journal of Machine Learning Research.

Colin Fogarty, Michael Fay, Jennifer Flegg, Kasia Stepniewska, Rick Fairhurst, Dylan Small (2015), Bayesian Hierarchical Regression on Clearance Rates in the Presence of “Lag” and “Tail” Phases with an Application to Malaria Parasites, Biometrics, 71, pp. 751-759.

Past Courses

STAT101 INTRO BUSINESS STAT

Data summaries and descriptive statistics; introduction to a statistical computer package; Probability: distributions, expectation, variance, covariance, portfolios, central limit theorem; statistical inference of univariate data; Statistical inference for bivariate data: inference for intrinsically linear simple regression models. This course will have a business focus, but is not inappropriate for students in the college.

STAT102 INTRO BUSINESS STAT

Continuation of STAT 101. A thorough treatment of multiple regression, model selection, analysis of variance, linear logistic regression; introduction to time series. Business applications.

STAT112 INTRODUCTORY STATISTICS

Further development of the material in STAT 111, in particular the analysis of variance, multiple regression, nonparametric procedures and the analysis of categorical data. Data analysis via statistical packages.

STAT475 SAMPLE SURVEY DESIGN

This course will cover the design and analysis of sample surveys. Topics include simple random sampling, stratified sampling, cluster sampling, graphics, regression analysis using complex surveys and methods for handling nonresponse bias.

STAT510 PROBABILITY

Elements of matrix algebra. Discrete and continuous random variables and their distributions. Moments and moment generating functions. Joint distributions. Functions and transformations of random variables. Law of large numbers and the central limit theorem. Point estimation: sufficiency, maximum likelihood, minimum variance. Confidence intervals.

STAT512 MATHEMATICAL STATISTICS

An introduction to the mathematical theory of statistics. Estimation, with a focus on properties of sufficient statistics and maximum likelihood estimators. Hypothesis testing, with a focus on likelihood ratio tests and the consequent development of "t" tests and hypothesis tests in regression and ANOVA. Nonparametric procedures.

STAT520 APPLIED ECONOMETRICS I

This is a course in econometrics for graduate students. The goal is to prepare students for empirical research by studying econometric methodology and its theoretical foundations. Students taking the course should be familiar with elementary statistical methodology and basic linear algebra, and should have some programming experience. Topics include conditional expectation and linear projection, asymptotic statistical theory, ordinary least squares estimation, the bootstrap and jackknife, instrumental variables and two-stage least squares, specification tests, systems of equations, generalized least squares, and an introduction to the use of linear panel data models.

STAT521 APPLIED ECONOMETRICS II

Topics include system estimation with instrumental variables, fixed effects and random effects estimation, M-estimation, nonlinear regression, quantile regression, maximum likelihood estimation, generalized method of moments estimation, minimum distance estimation, and binary and multinomial response models. Both theory and applications will be stressed.

STAT920 SAMPLE SURVEY METHODS

This course will cover the design and analysis of sample surveys. Topics include simple random sampling, stratified sampling, cluster sampling, graphics, regression analysis using complex surveys and methods for handling nonresponse bias.

STAT921 OBSERVATIONAL STUDIES

This course will cover statistical methods for the design and analysis of observational studies. Topics will include the potential outcomes framework for causal inference; randomized experiments; matching and propensity score methods for controlling confounding in observational studies; tests of hidden bias; sensitivity analysis; and instrumental variables.

STAT962 ADV METHODS APPLIED STAT

This course is designed for Ph.D. students in statistics and will cover various advanced methods and models that are useful in applied statistics. Topics for the course will include missing data, measurement error, nonlinear and generalized linear regression models, survival analysis, experimental design, longitudinal studies, building R packages and reproducible research.

STAT970 MATHEMATICAL STATISTICS

Decision theory and statistical optimality criteria, sufficiency, point estimation and hypothesis testing methods and theory.

STAT971 INTRO TO LINEAR STAT MOD

Theory of the Gaussian Linear Model, with applications to illustrate and complement the theory. Distribution theory of standard tests and estimates in multiple regression and ANOVA models. Model selection and its consequences. Random effects, Bayes, empirical Bayes and minimax estimation for such models. Generalized (Log-linear) models for specific non-Gaussian settings.

STAT991 SEM IN ADV APPL OF STAT

This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advanced probability, statistical inference, robust methods, and decision theory with principal emphasis on applications.

Fellow, American Statistical Association, 2013

Knowledge @ Wharton

Are Your Customers ‘Clumpy’? What Binge-buying Means for Marketers, Knowledge @ Wharton 12/17/2014

Bound for the Beach? Bring These Books…, Knowledge @ Wharton 06/26/2012
