"This textbook is a well-rounded, rigorous, and informative work presenting the mathematics behind modern machine learning techniques. It hits all the right notes: the choice of topics is up-to-date and perfect for a course on data science for mathematics students at the advanced undergraduate or early graduate level. This book fills a sorely-needed gap in the existing literature by not sacrificing depth for breadth, presenting proofs of major theorems and subsequent derivations, as well as providing a copious amount of Python code. I only wish a book like this had been around when I first began my journey!" -Nicholas Hoell, University of Toronto
"This is a well-written book that provides a deeper dive into data-scientific methods than many introductory texts. The writing is clear, and the text logically builds up regularization, classification, and decision trees. Compared to its probable competitors, it carves out a unique niche. -Adam Loy, Carleton College
The purpose of Data Science and Machine Learning: Mathematical and Statistical Methods is to provide an accessible yet comprehensive textbook for students who want a better understanding of the mathematics and statistics that underpin the rich variety of ideas and machine learning algorithms in data science.
Key Features:
- Focuses on mathematical understanding.
- Presentation is self-contained, accessible, and comprehensive.
- Extensive list of exercises and worked-out examples.
- Many concrete algorithms with Python code.
- Full color throughout.
Further resources can be found on the authors' website: https://github.com/DSML-book/Lectures
About the Authors
Dirk P. Kroese, PhD, is a Professor of Mathematics and Statistics at The University of Queensland. He has published over 120 articles and five books in a wide range of areas in mathematics, statistics, data science, machine learning, and Monte Carlo methods. He is a pioneer of the well-known Cross-Entropy method, an adaptive Monte Carlo technique that is used around the world to help solve difficult estimation and optimization problems in science, engineering, and finance.
Zdravko Botev, PhD, is an Australian Mathematical Sciences Institute Lecturer in Data Science and Machine Learning with an appointment at the University of New South Wales in Sydney, Australia. He is the recipient of the 2018 Christopher Heyde Medal of the Australian Academy of Science for distinguished research in the mathematical sciences.
Thomas Taimre, PhD, is a Senior Lecturer in Mathematics and Statistics at The University of Queensland. His research interests range from applied probability and Monte Carlo methods to applied physics and the remarkably universal self-mixing effect in lasers. He has published over 100 articles, holds a patent, and is the coauthor of Handbook of Monte Carlo Methods (Wiley).
Radislav Vaisman, PhD, is a Lecturer in Mathematics and Statistics at The University of Queensland. His research interests lie at the intersection of applied probability, machine learning, and computer science. He has published over 20 articles and two books.
Contents
Preface
Notation
Importing, Summarizing, and Visualizing Data
Introduction
Structuring Features According to Type
Summary Tables
Summary Statistics
Visualizing Data
Plotting Qualitative Variables
Plotting Quantitative Variables
Data Visualization in a Bivariate Setting
Exercises
Statistical Learning
Introduction
Supervised and Unsupervised Learning
Training and Test Loss
Tradeoffs in Statistical Learning
Estimating Risk
In-Sample Risk
Cross-Validation
Modeling Data
Multivariate Normal Models
Normal Linear Models
Bayesian Learning
Exercises
Monte Carlo Methods
Introduction
Monte Carlo Sampling
Generating Random Numbers
Simulating Random Variables
Simulating Random Vectors and Processes
Resampling
Markov Chain Monte Carlo
Monte Carlo Estimation
Crude Monte Carlo
Bootstrap Method
Variance Reduction
Monte Carlo for Optimization
Simulated Annealing
Cross-Entropy Method
Splitting for Optimization
Noisy Optimization
Exercises
Unsupervised Learning
Introduction
Risk and Loss in Unsupervised Learning
Expectation-Maximization (EM) Algorithm
Empirical Distribution and Density Estimation
Clustering via Mixture Models
Mixture Models
EM Algorithm for Mixture Models
Clustering via Vector Quantization
K-Means
Clustering via Continuous Multiextremal Optimization
Hierarchical Clustering
Principal Component Analysis (PCA)
Motivation: Principal Axes of an Ellipsoid
PCA and Singular Value Decomposition (SVD)
Exercises
Regression
Introduction
Linear Regression
Analysis via Linear Models
Parameter Estimation
Model Selection and Prediction
Cross-Validation and Predictive Residual Sum of Squares
In-Sample Risk and Akaike Information Criterion
Categorical Features
Nested Models
Coefficient of Determination
Inference for Normal Linear Models
Comparing Two Normal Linear Models
Confidence and Prediction Intervals
Nonlinear Regression Models
Linear Models in Python
Modeling
Analysis
Analysis of Variance (ANOVA)
Confidence and Prediction Intervals
Model Validation
Variable Selection
Generalized Linear Models
Exercises
Regularization and Kernel Methods
Introduction
Regularization
Reproducing Kernel Hilbert Spaces
Construction of Reproducing Kernels
Reproducing Kernels via Feature Mapping
Kernels from Characteristic Functions
Reproducing Kernels Using Orthonormal Features
Kernels from Kernels
Representer Theorem
Smoothing Cubic Splines
Gaussian Process Regression
Kernel PCA
Exercises
Classification
Introduction
Classification Metrics
Classification via Bayes' Rule
Linear and Quadratic Discriminant Analysis
Logistic Regression and Softmax Classification
K-Nearest Neighbors Classification
Support Vector Machine
Classification with Scikit-Learn
Exercises
Decision Trees and Ensemble Methods
Introduction
Top-Down Construction of Decision Trees
Regional Prediction Functions
Splitting Rules
Termination Criterion
Basic Implementation
Additional Considerations
Binary Versus Non-Binary Trees
Data Preprocessing
Alternative Splitting Rules
Categorical Variables
Missing Values
Controlling the Tree Shape
Cost-Complexity Pruning
Advantages and Limitations of Decision Trees
Bootstrap Aggregation
Random Forests
Boosting
Exercises
Deep Learning
Introduction
Feed-Forward Neural Networks
Back-Propagation
Methods for Training
Steepest Descent
Levenberg-Marquardt Method
Limited-Memory BFGS Method
Adaptive Gradient Methods
Examples in Python
Simple Polynomial Regression
Image Classif…