# Bipartite eQTL Network Construction

evaluate eQTL by modeling the association between SNP genotypes and gene expression.

- an $r\times n$ matrix $S$ of SNP genotypes
- an $r\times m$ matrix $G$ of gene expression

each with $r$ rows representing observations and columns representing $n$ SNPs and $m$ genes, respectively

consider a covariate matrix $X$, including features such as principal components for population structure, sex and age.

model the eQTL of a particular SNP $i$ on a locus's gene expression $j$

\[G_j = X^\top\alpha + \beta_{ij}S_i\]

the eQTL associations between all pairs of SNPs and genes can be represented as a bipartite network by treating each SNP $i$ and each gene $j$ as a node, and casting a function of their association as the edge between them
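the per-pair regression can be sketched in NumPy as follows. This is a minimal illustration rather than the pipeline used by any particular eQTL tool: the function name `eqtl_fit` is hypothetical, the covariates are assumed to be stored as an $r\times p$ array with one row per observation, an intercept column is added, and the reported statistic is the usual Wald ratio (estimate divided by its standard error) from ordinary least squares.

```python
import numpy as np

def eqtl_fit(g, s, X):
    """OLS fit of g on [intercept, X, s]; returns (beta_hat, z) for the SNP term.

    g : (r,) expression of one gene (hypothetical input name)
    s : (r,) genotype of one SNP, e.g. coded 0/1/2 (assumption)
    X : (r, p) covariates, rows are observations (assumption)
    """
    r = g.shape[0]
    # Design matrix: intercept, covariates, and the SNP genotype as the last column.
    D = np.column_stack([np.ones(r), X, s])
    coef, _, _, _ = np.linalg.lstsq(D, g, rcond=None)
    resid = g - D @ coef
    dof = r - D.shape[1]                     # residual degrees of freedom
    sigma2 = resid @ resid / dof             # residual variance estimate
    cov = sigma2 * np.linalg.inv(D.T @ D)    # covariance of the coefficient estimates
    beta = coef[-1]
    z = beta / np.sqrt(cov[-1, -1])          # Wald statistic for beta_ij
    return beta, z

# Simulated example: one SNP with a true effect of 2.0 on one gene.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))
s_demo = rng.binomial(2, 0.3, size=500).astype(float)
g_demo = 0.5 * X_demo[:, 0] + 2.0 * s_demo + rng.normal(size=500)
beta_demo, z_demo = eqtl_fit(g_demo, s_demo, X_demo)
```

running this over all $n \times m$ pairs gives the matrix of $z_{ij}$ values; in practice a dedicated eQTL package would be far faster than a Python loop.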

define a set of adjacency matrix representations based on summary statistics from eQTL analyses

\[a_{ij} = \vert z_{ij}\vert I\{Y_{ij} < \tau\}\,,\]

where

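- $z_{ij}$ is either set equal to 1 (unweighted representation) or set to the $z$-statistic for testing $\beta_{ij}$ in the eQTL regression between SNP $i$ and gene $j$ (weighted representation)
- $Y_{ij}$ is a measure of the significance of the eQTL association
- $\tau$ is the significance threshold and $I\{\cdot\}$ is the indicator function, so only associations with $Y_{ij} < \tau$ receive a nonzero edge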
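the adjacency definition above translates directly into array operations. The sketch below is illustrative only: the function name `adjacency` and the example numbers are made up, and $Y$ is treated as any significance measure where smaller means more significant (for instance a p-value), since the thresholding rule is $Y_{ij} < \tau$.

```python
import numpy as np

def adjacency(Z, Y, tau, weighted=True):
    """a_ij = |z_ij| * I{Y_ij < tau}; z_ij is fixed to 1 when weighted=False."""
    Z = np.asarray(Z, dtype=float)   # (n SNPs, m genes) z-statistics
    Y = np.asarray(Y, dtype=float)   # (n SNPs, m genes) significance measure
    keep = Y < tau                   # indicator I{Y_ij < tau}
    weights = np.abs(Z) if weighted else np.ones_like(Z)
    return weights * keep            # entries failing the threshold become 0

# Toy example with 2 SNPs and 2 genes (made-up values).
Z = [[2.5, -0.3], [-4.0, 1.0]]
Y = [[0.01, 0.8], [0.001, 0.04]]
A_weighted = adjacency(Z, Y, tau=0.05)                    # [[2.5, 0.0], [4.0, 1.0]]
A_unweighted = adjacency(Z, Y, tau=0.05, weighted=False)  # [[1.0, 0.0], [1.0, 1.0]]
```

because most SNP-gene pairs fail the threshold, $A$ is mostly zeros, so a sparse matrix format would be the natural storage choice at genome scale.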

three definitions of $Y$

to identify nodes (either SNPs or genes in the bipartite representation) that are central to the network

consider the network metric of degree

for the sparse representation of $A$, the degree of SNP $i$ and the degree of gene $j$ are defined as follows
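as a rough illustration, one standard convention for a bipartite adjacency matrix takes the degree of SNP $i$ to be the sum of row $i$ of $A$ and the degree of gene $j$ to be the sum of column $j$. The sketch below assumes that convention (it may differ in detail from the exact definitions intended here), and the function name `degrees` is hypothetical.

```python
import numpy as np

def degrees(A):
    """Row sums (SNP degrees) and column sums (gene degrees) of a SNP-by-gene adjacency matrix."""
    A = np.asarray(A, dtype=float)
    snp_degree = A.sum(axis=1)    # assumed convention: sum over genes j for each SNP i
    gene_degree = A.sum(axis=0)   # assumed convention: sum over SNPs i for each gene j
    return snp_degree, gene_degree

# Toy 2-SNP by 2-gene weighted adjacency matrix (made-up values).
A = [[2.5, 0.0], [4.0, 1.0]]
snp_deg, gene_deg = degrees(A)    # snp_deg = [2.5, 5.0], gene_deg = [6.5, 1.0]
```

with an unweighted $A$ the same sums simply count the number of significant eQTL partners of each node.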