Big Data in Omics and Imaging: Integrated Analysis and Causal Inference addresses recent developments in integrated genomic, epigenomic and imaging data analysis and in causal inference in the big data era. Despite significant progress in dissecting the genetic architecture of complex diseases through genome-wide association studies (GWAS), genome-wide expression studies (GWES), and epigenome-wide association studies (EWAS), the overall contribution of the newly identified genetic variants is small, and a large fraction of genetic variants remains hidden. The etiology and the causal chain of mechanisms underlying complex diseases remain elusive. It is time to bring big data, machine learning and the causal revolution to bear on a new generation of genetic analysis, shifting the current paradigm from shallow association analysis to deep causal inference, and from genetic analysis alone to integrated omics and imaging data analysis, in order to unravel the mechanisms of complex diseases.

FEATURES

  • Provides a natural extension and companion volume to Big Data in Omics and Imaging: Association Analysis, but can be read independently
  • Introduces causal inference theory to genomic, epigenomic and imaging data analysis
  • Develops novel statistics for genome-wide causation studies and epigenome-wide causation studies
  • Bridges the gap between traditional association analysis and modern causation analysis
  • Uses combinatorial optimization methods and various causal models as a general framework for inferring multilevel omic and image causal networks
  • Presents statistical methods and computational algorithms for searching causal paths from genetic variant to disease
  • Develops causal machine learning methods that integrate causal inference and machine learning
  • Develops statistics for testing significant differences in directed edges, paths, and graphs, and for assessing causal relationships between two networks

The book is designed for graduate students and researchers in genomics, epigenomics, medical imaging, bioinformatics, and data science. Topics covered are: mathematical formulation of causal inference, information geometry for causal inference, topological groups and Haar measure, additive noise models, distance correlation, multivariate causal inference and causal networks, dynamic causal networks, multivariate and functional structural equation models, mixed structural equation models, causal inference with confounders, integer programming, deep learning and differential equations for wearable computing, genetic analysis of function-valued traits, RNA-seq data analysis, causal networks for genetic methylation analysis, gene expression and methylation deconvolution, cell-specific causal networks, deep learning for image segmentation and image analysis, imaging and genomic data analysis, and integrated multilevel causal genomic, epigenomic and imaging data analysis.



About the Author

Momiao Xiong is a professor of Biostatistics at the University of Texas Health Science Center in Houston, where he has worked since 1997. He received his PhD in 1993 from the University of Georgia.



Contents

1. Genotype-Phenotype Network Analysis

Undirected Graphs for Genotype Network

Gaussian Graphical Model

Alternating Direction Method of Multipliers for Estimation of Gaussian Graphical Model

Coordinate Descent Algorithm and Graphical Lasso

Multiple Graphical Models

Directed Graphs and Structural Equation Models for Networks

Directed Acyclic Graphs

Linear Structural Equation Models

Estimation Methods

Sparse Linear Structural Equations

Penalized Maximum Likelihood Estimation

Penalized Two-Stage Least Squares Estimation

Penalized Three-Stage Least Squares Estimation

Functional Structural Equation Models for Genotype-Phenotype Networks

Functional Structural Equation Models

Group Lasso and ADMM for Parameter Estimation in the Functional Structural Equation Models

Causal Calculus

Effect Decomposition and Estimation

Graphical Tools for Causal Inference in Linear SEMs

Identification and Single-door Criterion

Instrumental Variables

Total Effects and Backdoor Criterion

Counterfactuals and Linear SEMs

Simulations and Real Data Analysis

Simulations for Model Evaluation

Application to Real Data Examples

Appendix 1A

Appendix 1B

Exercises

Figure Legend

2. Causal Analysis and Network Biology

Bayesian Networks as a General Framework for Causal Inference

Parameter Estimation and Bayesian Dirichlet Equivalent Uniform Score for Discrete Bayesian Networks

Structural Equations and Score Metrics for Continuous Causal Networks

Multivariate SEMs for Generating Node Score Metrics

Mixed SEMs for Pedigree-based Causal Inference

Bayesian Networks with Discrete and Continuous Variables

Two-class Network Penalized Logistic Regression for Learning Hybrid Bayesian Networks

Multiple Network Penalized Functional Logistic Regression Models for NGS Data

Multi-class Network Penalized Logistic Regression for Learning Hybrid Bayesian Networks

Other Statistical Models for Quantifying Node Score Function

Integer Programming for Causal Structure Learning

Introduction

Integer Linear Programming Formulation of DAG Learning

Cutting Plane for Integer Linear Programming

Branch and Cut Algorithm for Integer Linear Programming

Sink Finding Primal Heuristic Algorithm

Simulations and Real Data Analysis

Simulations

Real Data Analysis

Figure Legend

Software Package

Appendix 2A Introduction to Smoothing Splines

Smoothing Spline Regression for a Single Variable

Smoothing Spline Regression for Multiple Variables

Appendix 2B Penalized Likelihood Function for Jointly Observational and Interventional Data

Exercises

Figure Legend

3. Wearable Computing and Genetic Analysis of Function-valued Traits

Classification of Wearable Biosensor Data

Introduction

Functional Data Analysis for Classification of Time Course Wearable Biosensor Data

Differential Equations for Extracting Features of the Dynamic Process and for Classification of Time Course Data

Deep Learning for Physiological Time Series Data Analysis

Association Studies of Function-Valued Traits

Introduction

Functional Linear Models with both Functional Response and Predictors for Association Analysis of Function-valued Traits

Test Statistics

Null Distribution of Test Statistics

Power

Real Data Analysis

Association Analysis of Multiple Function-valued Traits

Gene-gene Interaction Analysis of Function-Valued Traits

Introduction

Functional Regression Models

Estimation of Interaction Effect Function

Test Statistics

Simulations

Real Data Analysis

Figure Legend

Appendix 3A Gradient Methods for Parameter Estimation in the Convolutional Neural Networks

Multilayer Feedforward Pass

Backpropagation Pass

Convolutional Layer

Exercises

4. RNA-seq Data Analysis

Normalization Methods on RNA-seq Data Analysis

Gene Expression

RNA Sequencing Expression Profiling

Methods for Normalization

Differential Expression Analysis for RNA-seq Data

Distribution-based Approach to Differential Expression Anal…

Title
Big Data in Omics and Imaging
Subtitle
Integrated Analysis and Causal Inference
EAN
9781351172622
Format
E-Book (epub)
Publication date
14 June 2018
Digital copy protection
Adobe-DRM
Number of pages
766