This book gives a comprehensive and systematic account of high-dimensional data analysis, including variable selection via regularization methods and sure independence feature screening methods. It is a valuable reference for researchers involved in model selection, variable selection, machine learning, and risk management.
About the Authors
The authors are international authorities and leaders in the topics presented. All are fellows of the Institute of Mathematical Statistics and the American Statistical Association.
Jianqing Fan is Frederick L. Moore Professor at Princeton University. He is a co-editor of the Journal of Business and Economic Statistics and was co-editor of The Annals of Statistics, Probability Theory and Related Fields, and the Journal of Econometrics. His honors include the 2000 COPSS Presidents' Award, fellowship in the AAAS, a Guggenheim Fellowship, the Guy Medal in Silver, the Noether Senior Scholar Award, and election as an Academician of Academia Sinica.
Runze Li is Eberly Family Chair Professor at Pennsylvania State University and an AAAS Fellow, and was co-editor of The Annals of Statistics.
Cun-Hui Zhang is Distinguished Professor at Rutgers University and was co-editor of Statistical Science.
Hui Zou is Professor at the University of Minnesota and was an action editor of the Journal of Machine Learning Research.
Back Cover
Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models and contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies and empirical applications.
The book begins with an introduction to the stylized features of big data and their impact on statistical analysis. It then introduces multiple linear regression and extends the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account of sparsity exploration and model selection for multiple regression, generalized linear models, quantile regression, robust regression, and hazards regression, among others. High-dimensional inference is also thoroughly addressed, as is feature screening. The book further gives a comprehensive account of high-dimensional covariance estimation, learning latent factors and hidden structures, and their applications to statistical estimation, inference, prediction, and machine learning problems. It also thoroughly introduces statistical machine learning theory and methods for classification, clustering, and prediction, including CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.
Contents
1. Introduction
Rise of Big Data and Dimensionality
Biological Sciences
Health Sciences
Computer and Information Sciences
Economics and Finance
Business and Program Evaluation
Earth Sciences and Astronomy
Impact of Big Data
Impact of Dimensionality
Computation
Noise Accumulation
Spurious Correlation
Statistical Theory
Aim of High-dimensional Statistical Learning
What big data can do
Scope of the book
2. Multiple and Nonparametric Regression
Introduction
Multiple Linear Regression
The Gauss-Markov Theorem
Statistical Tests
Weighted Least-Squares
Box-Cox Transformation
Model Building and Basis Expansions
Polynomial Regression
Spline Regression
Multiple Covariates
Ridge Regression
Bias-Variance Tradeoff
Penalized Least Squares
Bayesian Interpretation
Ridge Regression Solution Path
Kernel Ridge Regression
Regression in Reproducing Kernel Hilbert Space
Leave-one-out and Generalized Cross-validation
Exercises
3. Introduction to Penalized Least Squares
Classical Variable Selection Criteria
Subset selection
Relation with penalized regression
Selection of regularization parameters
Folded-concave Penalized Least Squares
Orthonormal designs
Penalty functions
Thresholding by SCAD and MCP
Risk properties
Characterization of folded-concave PLS
Lasso and L1 Regularization
Nonnegative garrote
Lasso
Adaptive Lasso
Elastic Net
Dantzig selector
SLOPE and Sorted Penalties
Concentration inequalities and uniform convergence
A brief history of model selection
Bayesian Variable Selection
Bayesian view of the PLS
A Bayesian framework for selection
Numerical Algorithms
Quadratic programs
Least angle regression
Local quadratic approximations
Local linear algorithm
Penalized linear unbiased selection
Cyclic coordinate descent algorithms
Iterative shrinkage-thresholding algorithms
Projected proximal gradient method
ADMM
Iterative Local Adaptive Majorization and Minimization
Other Methods and Timeline
Regularization parameters for PLS
Degrees of freedom
Extension of information criteria
Application to PLS estimators
Residual variance and refitted cross-validation
Residual variance of Lasso
Refitted cross-validation
Extensions to Nonparametric Modeling
Structured nonparametric models
Group penalty
Applications
Bibliographical notes
Exercises
4. Penalized Least Squares: Properties
Performance Benchmarks
Performance measures
Impact of model uncertainty
Bayes lower bounds for orthogonal design
Minimax lower bounds for general design
Performance goals, sparsity and sub-Gaussian noise
Penalized L0 Selection
Lasso and Dantzig Selector
Selection consistency
Prediction and coefficient estimation errors
Model size and least squares after selection
Properties of the Dantzig selector
Regularity conditions on the design matrix
Properties of Concave PLS
Properties of penalty functions
Local and oracle solutions
Properties of local solutions
Global and approximate global solutions
Smaller and Sorted Penalties
Sorted concave penalties and their local approximation
Approximate PLS with smaller and sorted penalties
Properties of LLA and LCA
Bibliographical notes
Exercises
5. Generalized Linear Models and Penalized Likelihood
Generalized Linear Models
Exponential family
Elements of generalized linear models
Maximum likelihood
Computing MLE: Iteratively reweighted least squares
Deviance and Analysis of Deviance
Residuals
Examples
Bernoulli and binomial models
Models for count responses
Models for nonnegative continuous responses
Normal error models
Sparsest solution in high confidence set
A general setup
Examples
Properties
Variable Selection via Penalized Likelihood
Algorithms