Examine the latest technological advancements in building a scalable machine-learning model with big data using R. This second edition shows you how to work with a machine-learning algorithm and use it to build an ML model from raw data. You will see how to use R programming with TensorFlow, thus avoiding the effort of learning Python if you are only comfortable with R.
As in the first edition, the authors balance the theory and application of machine learning through real-world use cases, giving you a comprehensive collection of machine-learning topics. New chapters in this edition cover time series models and deep learning.
You will:
- Understand machine learning algorithms using R
- Master the process of building machine-learning models
- Cover the theoretical foundations of machine-learning algorithms
- See industry-focused real-world use cases
- Tackle time series modeling in R
- Apply deep learning using Keras and TensorFlow in R
About the Authors
Karthik Ramasubramanian has over seven years' experience leading data science and business analytics in retail, FMCG, e-commerce, information technology and hospitality for multi-national companies and unicorn startups. He is a researcher and problem solver with diverse experience across the data science life cycle, from data problem discovery to creating data science PoCs and products for various industry use cases. In his leadership roles, he has been instrumental in solving many ROI-driven business problems through data science solutions. He has mentored and trained hundreds of professionals and students around the world through various online platforms and university engagement programs in data science.
He has designed, developed and spearheaded many A/B experiment frameworks for improving product features, conceptualized funnel analysis for understanding user interactions and identifying the friction points within a product, and designed statistically robust metrics. On the predictive side, he has developed intelligent chatbots based on deep learning models that understand human-like interactions, as well as customer segmentation models, recommendation systems and many natural language processing models.
His current areas of interest include ROI-driven data product development, advanced machine learning algorithms, data product frameworks, Internet of Things (IoT), scalable data platforms, and model deployment frameworks.
Karthik completed his M.Sc. (Theoretical Computer Science) at PSG College of Technology, Coimbatore (affiliated with Anna University, Chennai), where he pioneered the application of machine learning, data mining and fuzzy logic in his research work on computer and network security.
Abhishek Singh is on a mission to profess the de facto language of this millennium: numbers. He is on a journey to bring machines closer to humans, for a better and more beautiful world, by generating opportunities with artificial intelligence and machine learning. He leads a team of data science professionals who are solving pressing problems in food security, cyber security, natural disasters, healthcare and many more areas, all with the help of data and technology. Abhishek is in the process of bringing smart IoT devices to smaller cities in India so that people can leverage technology to improve their lives.
He has worked with colleagues from many parts of the USA, Europe and Asia, and strives to work with more people from various backgrounds. In a span of six years at big corporates, he has stress tested the assets of US banks, solved insurance pricing models, and made the telecom experience easier for customers, and is now creating data science opportunities with his team of young minds.
He actively participates in analytics-related thought leadership, writing, public speaking, meet-ups and training in data science. He is a staunch supporter of the responsible and fair use of AI to remove biases, for a better society.
Abhishek holds an MBA from IIM Bangalore, a B.Tech. (Mathematics and Computing) from IIT Guwahati, and a PG Diploma (Cyber Law) from NALSAR University, Hyderabad.
Contents
Chapter 1: Introduction to Machine Learning
Chapter Goal: This chapter walks through the What, Why, Where and How questions commonly asked by beginners in Machine Learning. The answers set the momentum and direction for the chapters that follow.
No of pages: 25
Sub - Topics:
1. What does a Machine really learn?
2. Why is Machine Learning so popular?
3. Where do we use Machine Learning?
4. How is Machine Learning changing our way of life?
5. Machine Learning Tools and Software
6. Machine Learning using R
Chapter 2: Data Exploration and Preparation
Chapter Goal: The basis for building a good Machine Learning model is a clear understanding of the data and thorough preparation of it. This chapter explains ways to explore the data and how to deal with the inconsistencies present in it.
No of pages: 50
Sub - Topics:
1. Various Data Formats
2. Summary Statistics
3. Missing Values
4. Data Imputation
5. Transforming Unstructured Data
Chapter 3: Sampling and Resampling Techniques
Chapter Goal: In many real-world datasets, the biggest challenge is the sheer volume of the data. This volume makes computational limitations more evident when building Machine Learning models. To reduce the need for computational power without compromising the efficacy of the model, this chapter explains some sampling techniques for selecting a smaller dataset from the bigger dataset. We will also explore the idea of resampling, which can improve the accuracy of many Machine Learning models.
No of pages: 50
Sub - Topics:
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling
5. Bootstrap Sampling
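As a rough illustration of two of the techniques listed above, the sketch below shows stratified sampling and bootstrap resampling. The book itself works in R; this example uses plain Python only as a neutral illustration, and the function names (`stratified_sample`, `bootstrap_means`), the `segment` and `spend` fields, and the toy data are all hypothetical, not taken from the book.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, seed=0):
    """Draw the same fraction of rows from every stratum defined by key(row)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for group in strata.values():
        # Keep at least one row per stratum so small groups are not lost.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))  # sampling without replacement
    return sample


def bootstrap_means(values, n_resamples=1000, seed=0):
    """Resample with replacement and return the mean of each resample."""
    rng = random.Random(seed)
    n = len(values)
    return [sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)]


# Toy data (made up): 80 rows in segment A, 20 rows in segment B.
rows = [{"segment": "A", "spend": float(i)} for i in range(80)]
rows += [{"segment": "B", "spend": float(100 + i)} for i in range(20)]

sample = stratified_sample(rows, key=lambda r: r["segment"], fraction=0.1)
means = bootstrap_means([r["spend"] for r in sample])
```

The stratified sample keeps the 80/20 split between the two segments, which a simple random sample of the same size would not guarantee. The spread of the bootstrap means gives a rough sense of how uncertain the sample mean is.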
Chapter 4: Visualization of Data
Chapter Goal: Visualization is a powerful tool for seeing patterns in our data that might not be evident from manual exploration. This chapter explains some commonly used plots and diagrams for drawing visual insights from our data.
No of pages: 50
Sub - Topics:
1. Scatterplot, Histogram and Box Plot
2. Heat Maps and Waterfall Charts
3. Dendrogram for Clustering
4. Bubble Chart and Word Cloud
5. Sankey Diagrams
6. Time Series Graphs
7. Cohort Diagram
Chapter 5: Feature Engineering
Chapter Goal: Another challenge in real-world datasets is the number of features they contain. A dataset might have hundreds of features, but not all of them are useful for building a model. To select the features that explain the dataset better than the others, and hence give more accurate results, we rely on well-proven techniques derived from statistics. Feature engineering has become an unavoidable step in the process of building a Machine Learning model.
No of pages: 40
Sub - Topics:
1. Feature Ranking
2. Variable Subset Selection
3. Dimensionality Reduction
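One simple statistical way to rank features, shown here only as a sketch, is to score each feature by the absolute Pearson correlation between its values and the target. This is one possible ranking criterion among many and is not necessarily the method the book uses. The book works in R; the Python below is a neutral illustration, and the names `pearson` and `rank_features` and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)


def rank_features(features, target):
    """Return (name, score) pairs sorted by |correlation| with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (made up): one informative, one weak and one constant feature.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "strong": [2.1, 3.9, 6.2, 8.0, 9.8],
    "weak": [5.0, 1.0, 4.0, 2.0, 3.0],
    "constant": [7.0] * 5,
}
ranking = rank_features(features, target)
```

A ranking like this captures only linear, one-feature-at-a-time relationships, which is why subset selection and dimensionality reduction are treated as separate topics.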
Chapter 6: Machine Learning Models: Theory and Practice
Chapter Goal: This chapter is the core of this book. After we had the fair unders…