In just 24 lessons of one hour or less, Sams Teach Yourself R in 24 Hours helps you learn all the R skills you need to solve a wide spectrum of real-world data analysis problems. You'll master the entire data analysis workflow, learning to build code that's efficient, reproducible, and suitable for sharing with others.
This book's straightforward, step-by-step approach teaches you how to import, manipulate, summarize, model, and plot data with R; formalize your analytical code; and build powerful R packages using current best practices.
Practical, hands-on examples show you how to apply what you learn. Quizzes and exercises help you test your knowledge and stretch your skills.
Learn How To
- Install, configure, and explore the R environment, including RStudio
- Use basic R syntax, objects, and packages
- Create and manage data structures, including vectors, matrices, and arrays
- Understand lists and data frames
- Work with dates, times, and factors
- Use common R functions, and learn to write your own
- Import and export data and connect to databases and spreadsheets
- Use the popular tidyr, dplyr, and data.table packages
- Write more efficient R code with profiling, vectorization, and initialization
- Plot data and extend your graphical capabilities with ggplot2 and Lattice graphics
- Develop common types of models
- Construct high-quality packages, both simple and complex
- Write R classes: S3, S4, and Reference Classes
- Use R to generate dynamic reports
- Build web applications with Shiny
Register your book at informit.com/register for convenient access to updates and corrections as they become available.
This book's source code can be found at http://www.mango-solutions.com/wp/teach-yourself-r-in-24-hours-book/.
About the Authors
Andy Nicholls has a Master of Mathematics degree from the University of Bath and a Master of Science in Statistics with Applications in Medicine from the University of Southampton. Andy worked as a Senior Statistician in the pharmaceutical industry for a number of years before joining Mango Solutions as an R consultant in 2011. Since joining Mango, Andy has taught more than 50 on-site R training courses and has been involved in the development of more than 30 R packages. Today, he manages Mango Solutions' R consultancy team and continues to be a regular contributor to the quarterly LondonR events, by far the largest R user group in the UK, with over 1,000 meet-up members. Andy lives near the historic city of Bath, UK, with his wonderful, tolerant wife and son.
Richard Pugh has a first-class Mathematics degree from the University of Bath. Richard worked as a statistician in the pharmaceutical industry before joining the pre-sales consulting team at Insightful, the developers of S-PLUS. Richard's role at Insightful included a variety of activities, providing a range of training and consulting services to blue-chip customers across many sectors. In 2002, Richard co-founded Mango Solutions, developing the company and leading technical efforts around R and other analytic software. Richard is now Mango's Chief Data Scientist and speaks regularly at data science and R events. Richard lives in Bradford on Avon, UK, with his wife and two kids, and spends most of his "spare" (ha!) time renovating his house.
Aimee Gott has a PhD in Statistics from Lancaster University where she also completed her undergraduate and master's degrees. As Training Lead, Aimee has delivered over 200 days of training for Mango. She has delivered on-site training courses in Europe and the U.S. in all aspects of R, as well as shorter workshops and online webinars. Aimee oversees Mango's training course development across the data science pipeline, and regularly attends R user groups and meet-ups. In her spare time, Aimee enjoys learning European languages and documenting her travels through photography.
Contents
Preface

HOUR 1: The R Community
A Concise History of R; The R Community; R Development; Summary; Q&A; Workshop; Activities

HOUR 2: The R Environment
Integrated Development Environments; R Syntax; R Objects; Using R Packages; Internal Help; Summary; Q&A; Workshop; Activities

HOUR 3: Single-Mode Data Structures
The R Data Types; Vectors, Matrices, and Arrays; Vectors; Matrices; Arrays; Relationship Between Single-Mode Data Objects; Summary; Q&A; Workshop; Activities

HOUR 4: Multi-Mode Data Structures
Multi-Mode Structures; Lists; Data Frames; Exploring Your Data; Summary; Q&A; Workshop; Activities

HOUR 5: Dates, Times, and Factors
Working with Dates and Times; The lubridate Package; Working with Categorical Data; Summary; Q&A; Workshop; Activities

HOUR 6: Common R Utility Functions
Using R Functions; Functions for Numeric Data; Logical Data; Missing Data; Character Data; Summary; Q&A; Workshop; Activities

HOUR 7: Writing Functions: Part I
The Motivation for Functions; Creating a Simple Function; The If/Else Structure; Summary; Q&A; Workshop; Activities

HOUR 8: Writing Functions: Part II
Errors and Warnings; Checking Inputs; The Ellipsis; Checking Multivalue Inputs; Using Input Definition; Summary; Q&A; Workshop; Activities

HOUR 9: Loops and Summaries
Repetitive Tasks; The "apply" Family of Functions; The apply Function; The lapply Function; The sapply Function; The tapply Function; Summary; Q&A; Workshop; Activities

HOUR 10: Importing and Exporting
Working with Text Files; Relational Databases; Working with Microsoft Excel; Summary; Q&A; Workshop; Activities

HOUR 11: Data Manipulation and Transformation
Sorting; Appending; Merging; Duplicate Values; Restructuring; Data Aggregation; Summary; Q&A; Workshop; Activities

HOUR 12: Efficient Data Handling in R
dplyr: A New Way of Handling Data; Efficient Data Handling with data.table; Summary; Q&A; Workshop; Activities

HOUR 13: Graphics
Graphics Devices and Colors; High-Level Graphics Functions; Low-Level Graphics Functions; Graphical Parameters; Controlling the Layout; Summary; Q&A; Workshop; Activities

HOUR 14: The ggplot2 Package for Graphics
The Philosophy of ggplot2; Quick Plots and Basic Control; Changing Plot Types; Aesthetics; Paneling (a.k.a. Faceting); Custom Plots; Themes and Layout; The ggvis Evolution; Summary; Q&A; Workshop; Activities

HOUR 15: Lattice Graphics
The History of Trellis Graphics; The Lattice Package; Creating a Simple Lattice Graph; Graph Options; Multiple Variables; Groups of Data; Using Panels; Controlling Styles; Summary; Q&A; Workshop; Activities

HOUR 16: Introduction to R Models and Object Orientation
Statistical Models in R; Simple Linear Models; Assessing a Model in R; Multiple Linear Regression; Interaction Terms; Factor Independent Variables; Variable Transformations; R and Object Orientation; Summary; Q&A; Workshop; Activities

HOUR 17: Common R Models
Generalized Linear Models; Nonlinear Models; Survival Analysis; Time Series Analysis; Summary; Q&A; Workshop; Activities

HOUR 18: Code Efficiency
Determining Efficiency; Initialization; Vectorization; Using Alternative Functions…