Data analytics requires powerful and flexible computing software. However, the software available for data analytics is often proprietary and can be expensive. This book reviews Apache tools, which are open source and easy to use. After an overview of the background of data analytics, covering the different types of analysis and the basics of using Hadoop as a tool, it focuses on tools from the Hadoop ecosystem, such as Apache Flume, Apache Spark, Apache Storm, and Apache Hive, along with R and Python, which can be used for different types of analysis. It then examines the machine learning techniques that are useful for data analytics, and how to visualize data with different graphs and charts.

Presenting data analytics from a practice-oriented viewpoint, the book discusses useful tools and approaches for data analytics, supported by concrete code examples. The book is a valuable reference resource for graduate students and professionals in related fields, and is also of interest to general readers with an understanding of data analytics.



About the authors
Dr. Krishnarajanagar GopalaIyengar Srinivasa is an associate professor and the head of the Department of IT at C.B.P. Government Engineering College, Jaffarpur, New Delhi, India. His other publications include the Springer book Guide to High Performance Distributed Computing.
 
Dr. Gaddadevara Matt Siddesh is an associate professor at the Department of Information Science and Engineering at Ramaiah Institute of Technology, Bangalore, India.
 
Srinidhi Hiriyannaiah is an assistant professor at the Department of Computer Science and Engineering at Ramaiah Institute of Technology, Bangalore, India.


Contents

Part I: Data Analytics and Hadoop

Introduction to Data Analytics

Introduction to Hadoop

Data Analytics with Map Reduce

Part II: Tools for Data Analytics

Apache Pig

Apache Hive

Apache Spark

Apache Flume

Apache Storm

Python

R

Part III: Machine Learning for Data Analytics

Basics of Machine Learning

Linear Regression

Logistic Regression

Machine Learning on Spark

Part IV: Exploring and Visualizing Data

Introduction to Visualization

Principles of Data Visualization

Visualization Charts

Popular Visualization Tools

Data Visualization with Hadoop

Part V: Case Studies

Product Recommendation

Market Basket Analysis

Title
Network Data Analytics
Subtitle
A Hands-On Approach for Application Development
EAN
9783319778006
Format
E-Book (pdf)
Publication date
26.04.2018
Digital copy protection
Watermark
File size
8.92 MB
Number of pages
398