WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Bipartite eQTL Network Construction

Posted on
Tags: eQTL, Bipartite Graph

This post is for Gaynor, S. M., Fagny, M., Lin, X., Platig, J., & Quackenbush, J. (2022). Connectivity in eQTL networks dictates reproducibility and genomic properties. Cell Reports Methods, 2(5), 100218.

evaluate eQTL by modeling the association between SNP genotypes and gene expression.

  • an $r\times n$ matrix $S$ of SNP genotypes
  • an $r\times m$ matrix $G$ of gene expression

each with $r$ rows representing observations and columns representing $n$ SNPs and $m$ genes, respectively

consider a covariate matrix $X$, including features such as principal components for population structure, sex and age.

model the eQTL effect of a particular SNP $i$ (column $S_i$ of $S$) on the expression of gene $j$ (column $G_j$ of $G$)

\[G_j = X^\top\alpha + \beta_{ij}S_i\]
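a minimal sketch of where a test statistic for $\beta_{ij}$ could come from, for a single SNP-gene pair. It assumes the simplest case, which the note does not state: the covariates reduce to an intercept, the errors are independent with constant variance, and the fit is ordinary least squares. The function name `eqtl_z` and the toy data are hypothetical, and the returned statistic is the usual $t$-type ratio $\hat\beta/\mathrm{se}(\hat\beta)$, used here as a stand-in for the $z$-statistic.

```python
import math

def eqtl_z(s, g):
    """OLS fit of g = alpha + beta * s + error; returns (beta_hat, beta_hat / se)."""
    r = len(s)
    s_bar = sum(s) / r
    g_bar = sum(g) / r
    # centered sums of squares and cross-products
    sxx = sum((si - s_bar) ** 2 for si in s)
    sxy = sum((si - s_bar) * (gi - g_bar) for si, gi in zip(s, g))
    beta = sxy / sxx
    alpha = g_bar - beta * s_bar
    # residual sum of squares with r - 2 degrees of freedom
    rss = sum((gi - alpha - beta * si) ** 2 for si, gi in zip(s, g))
    se = math.sqrt(rss / (r - 2) / sxx)
    return beta, beta / se

# toy data (assumption): genotypes coded 0/1/2 for r = 8 observations
s = [0, 1, 2, 0, 1, 2, 1, 0]
g = [1.1, 1.9, 3.2, 0.8, 2.1, 2.9, 2.0, 1.2]
beta, z = eqtl_z(s, g)
```

with more covariates in $X$, the same statistic would come from a multiple regression, which is easier to do with a linear algebra library than by hand.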

the eQTL associations between all pairs of SNPs and genes can be represented as a bipartite network by considering each SNP $i$ and each gene $j$ to be a node in the network, and casting a function of their association as edges

define a set of adjacency matrix representations based on summary statistics from eQTL analyses

\[a_{ij} = \vert z_{ij}\vert I\{Y_{ij} < \tau\}\,,\]

where

  • $z_{ij}$ is either set equal to 1 for an unweighted representation or the $z$-statistic for testing $\beta_{ij}$ from the eQTL regression between SNP $i$ and gene $j$ for a weighted representation
  • $Y_{ij}$ is a measure of the significance of the eQTL association
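a small sketch of the thresholding rule $a_{ij} = \vert z_{ij}\vert I\{Y_{ij} < \tau\}$, with rows as SNPs and columns as genes. The function name `adjacency`, the toy values, and the choice $\tau = 0.05$ are assumptions for illustration, and $Y$ is treated as a p-value-like quantity, which is only one possible reading of "a measure of the significance".

```python
def adjacency(z, y, tau, weighted=True):
    """a_ij = |z_ij| * 1{y_ij < tau}; z_ij is replaced by 1 when weighted is False."""
    return [
        [(abs(z_ij) if weighted else 1.0) if y_ij < tau else 0.0
         for z_ij, y_ij in zip(z_row, y_row)]
        for z_row, y_row in zip(z, y)
    ]

# toy summary statistics (assumption): 3 SNPs (rows) by 2 genes (columns)
z = [[4.2, -0.3], [-5.1, 2.9], [0.8, -3.6]]
y = [[0.001, 0.80], [0.0002, 0.04], [0.45, 0.01]]

A_weighted = adjacency(z, y, tau=0.05)
A_unweighted = adjacency(z, y, tau=0.05, weighted=False)
```

entries that fail the threshold are exactly zero, so the resulting matrix is sparse whenever few pairs pass.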

three definitions of $Y$

to identify nodes (either SNPs or genes in the bipartite representation) that are central to the network

consider the network metric of degree

for the sparse representation of $A$, the degree of SNP $i$ and the degree of gene $j$ are defined as follows
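the formulas are not reproduced in these notes. Under the usual convention for a bipartite adjacency matrix (an assumption here, not a quote from the paper), the degree of SNP $i$ would be the row sum $\sum_j a_{ij}$ and the degree of gene $j$ the column sum $\sum_i a_{ij}$. A sketch with a hypothetical helper `degrees` and a toy unweighted matrix:

```python
def degrees(A):
    """Row sums give SNP degrees, column sums give gene degrees."""
    snp_degree = [sum(row) for row in A]
    gene_degree = [sum(col) for col in zip(*A)]
    return snp_degree, gene_degree

# toy unweighted adjacency matrix (assumption): 3 SNPs (rows) by 2 genes (columns)
A = [[1, 0],
     [1, 1],
     [0, 1]]

snp_degree, gene_degree = degrees(A)
```

with a weighted $A$, the same sums would give a weighted degree, that is, the total $\vert z\vert$ over the retained edges of each node.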


Published in categories Note