Hands-on Machine Learning with R provides a practical and applied approach to learning and developing intuition into today's most popular machine learning methods. This book serves as a practitioner's guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R, which includes using various R packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from their data. The book favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory.
Throughout this book, the reader will be exposed to the entire machine learning process including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. The reader will also meet powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more! By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, understand when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R's machine learning stack and be able to implement a systematic approach for producing high-quality modeling results.
Features:
· Offers a practical and applied introduction to the most popular machine learning methods.
· Topics covered include feature engineering, resampling, deep learning, and more.
· Uses a hands-on approach and real-world data.
About the authors
Brad Boehmke is a data scientist at 84.51° where he wears both software developer and machine learning engineer hats. He is an Adjunct Professor at the University of Cincinnati, author of Data Wrangling with R, and creator of multiple public and private enterprise R packages.
Brandon Greenwell is a data scientist at 84.51° where he works on a diverse team to enable, empower, and encourage others to successfully apply machine learning to solve real business problems. He's part of the Adjunct Graduate Faculty at Wright State University, an Adjunct Instructor at the University of Cincinnati, and the author of several R packages available on CRAN.
Blurb
Features:
- Offers a practical and applied introduction to the most popular machine learning methods.
- Takes readers through the entire modeling process, from data prep to hyperparameter tuning, model evaluation, and interpretation.
- Introduces readers to a wide variety of packages that make up R's machine learning stack.
- Uses a hands-on approach and real-world data.
Contents
FUNDAMENTALS
Introduction to Machine Learning
Supervised learning
Regression problems
Classification problems
Unsupervised learning
Roadmap
The data sets
Modeling Process
Prerequisites
Data splitting
Simple random sampling
Stratified sampling
Class imbalances
Creating models in R
Many formula interfaces
Many engines
Resampling methods
k-fold cross validation
Bootstrapping
Alternatives
Bias variance trade-off
Bias
Variance
Hyperparameter tuning
Model evaluation
Regression models
Classification models
Putting the processes together
Feature & Target Engineering
Prerequisites
Target engineering
Dealing with missingness
Visualizing missing values
Imputation
Feature filtering
Numeric feature engineering
Skewness
Standardization
Categorical feature engineering
Lumping
One-hot & dummy encoding
Label encoding
Alternatives
Dimension reduction
Proper implementation
Sequential steps
Data leakage
Putting the process together
SUPERVISED LEARNING
Linear Regression
Prerequisites
Simple linear regression
Estimation
Inference
Multiple linear regression
Assessing model accuracy
Model concerns
Principal component regression
Partial least squares
Feature interpretation
Final thoughts
Logistic Regression
Prerequisites
Why logistic regression
Simple logistic regression
Multiple logistic regression
Assessing model accuracy
Model concerns
Feature interpretation
Final thoughts
Regularized Regression
Prerequisites
Why regularize?
Ridge penalty
Lasso penalty
Elastic nets
Implementation
Tuning
Feature interpretation
Attrition data
Final thoughts
Multivariate Adaptive Regression Splines
Prerequisites
The basic idea
Multivariate regression splines
Fitting a basic MARS model
Tuning
Feature interpretation
Attrition data
Final thoughts
K-Neare…