Integrates the theory and applications of statistics using R
A Course in Statistics with R has been written to bridge the gap between theory and applications and to explain how mathematical expressions are converted into R programs. The book is designed primarily as a companion for a Master's student during each semester of the course, but it will also help applied statisticians revisit the underpinnings of the subject. With this dual goal in mind, the book begins with R basics and quickly covers visualization and exploratory analysis. Probability and statistical inference, including the classical, nonparametric, and Bayesian schools, are developed with definitions, motivations, mathematical expressions, and R programs in a way that helps the reader understand both the mathematical development and the R implementation. Linear regression models, experimental designs, multivariate analysis, and categorical data analysis are treated through practical applications that make effective use of visualization and the statistical techniques underlying it, helping the reader achieve a clear understanding of the associated statistical models.
Key features:
* Integrates R basics with statistical concepts
* Provides graphical presentations inclusive of mathematical expressions
* Aids understanding of limit theorems of probability with and without the simulation approach
* Presents detailed algorithmic development of statistical models from scratch
* Includes practical applications with over 50 data sets
About the authors
Prabhanjan Tattar, Business Analysis Senior Advisor at Dell International Services, Bangalore, India. Professor Tattar is a statistician who provides analytical solutions to business problems, including statistical models and machine learning as appropriate.
Suresh Ramaiah, Assistant Professor of Statistics at Dharwad University, Dharwad, India.
B G Manjunath, Business Analysis Advisor at Dell International Services, Bangalore, India.
Contents
List of Figures xvii
List of Tables xxi
Preface xxiii
Acknowledgments xxv
Part I THE PRELIMINARIES
1 Why R? 3
1.1 Why R? 3
1.2 R Installation 5
1.3 There is Nothing such as PRACTICALS 5
1.4 Datasets in R and Internet 6
1.4.1 List of Web-sites containing DATASETS 7
1.4.2 Antique Datasets 8
1.5 http://cran.r-project.org 9
1.5.1 http://r-project.org 10
1.5.2 http://www.cran.r-project.org/web/views/ 10
1.5.3 Is subscribing to R-Mailing List useful? 10
1.6 R and its Interface with other Software 11
1.7 help and/or? 11
1.8 R Books 12
1.9 A Road Map 13
2 The R Basics 15
2.1 Introduction 15
2.2 Simple Arithmetics and a Little Beyond 16
2.2.1 Absolute Values, Remainders, etc. 16
2.2.2 round, floor, etc. 17
2.2.3 Summary Functions 18
2.2.4 Trigonometric Functions 18
2.2.5 Complex Numbers 19
2.2.6 Special Mathematical Functions 21
2.3 Some Basic R Functions 22
2.3.1 Summary Statistics 23
2.3.2 is, as, is.na, etc. 25
2.3.3 factors, levels, etc. 26
2.3.4 Control Programming 27
2.3.5 Other Useful Functions 29
2.3.6 Calculus* 31
2.4 Vectors and Matrices in R 33
2.4.1 Vectors 33
2.4.2 Matrices 36
2.5 Data Entering and Reading from Files 41
2.5.1 Data Entering 41
2.5.2 Reading Data from External Files 43
2.6 Working with Packages 44
2.7 R Session Management 45
2.8 Further Reading 46
2.9 Complements, Problems, and Programs 46
3 Data Preparation and Other Tricks 49
3.1 Introduction 49
3.2 Manipulation with Complex Format Files 50
3.3 Reading Datasets of Foreign Formats 55
3.4 Displaying R Objects 56
3.5 Manipulation Using R Functions 57
3.6 Working with Time and Date 59
3.7 Text Manipulations 62
3.8 Scripts and Text Editors for R 64
3.8.1 Text Editors for Linuxians 64
3.9 Further Reading 65
3.10 Complements, Problems, and Programs 65
4 Exploratory Data Analysis 67
4.1 Introduction: The Tukey's School of Statistics 67
4.2 Essential Summaries of EDA 68
4.3 Graphical Techniques in EDA 71
4.3.1 Boxplot 71
4.3.2 Histogram 76
4.3.3 Histogram Extensions and the Rootogram 79
4.3.4 Pareto Chart 81
4.3.5 Stem-and-Leaf Plot 84
4.3.6 Run Chart 88
4.3.7 Scatter Plot 89
4.4 Quantitative Techniques in EDA 91
4.4.1 Trimean 91
4.4.2 Letter Values 92
4.5 Exploratory Regression Models 95
4.5.1 Resistant Line 95
4.5.2 Median Polish 98
4.6 Further Reading 99
4.7 Complements, Problems, and Programs 100
Part II PROBABILITY AND INFERENCE
5 Probability Theory 105
5.1 Introduction 105
5.2 Sample Space, Set Algebra, and Elementary Probability 106
5.3 Counting Methods 113
5.3.1 Sampling: The Diverse Ways 114
5.3.2 The Binomial Coefficients and the Pascal's Triangle 118
5.3.3 Some Problems Based on Combinatorics 119
5.4 Probability: A Definition 122
5.4.1 The Prerequisites 122
5.4.2 The Kolmogorov Definition 127
5.5 Conditional Probability and Independence 130
5.6 Bayes Formula 132
5.7 Random Variables, Expectations, and Moments 133
5.7.1 The Definition 133
5.7.2 Expectation of Random Variables 136
5.8 Distribution Function, Characteristic Function, and Moment Generation Function 143
5.9 Inequalities 145
5.9.1 The Markov Inequality 145
5.9.2 The Jensen's Inequality 145
5.9.3 The Chebyshev Inequality 146
5.10 Convergence of Random Variables 146
5.10.1 Convergence in Distributions 147
5.10.2 Convergence in Probability 150
5.10.3 Convergence in rth Mean 150
5.10.4 Almost Sure Convergence 151
5.11 The Law of Large Numbers 152
5.11.1 The Weak Law of Large Num…