WeiYa's Work Yard

A traveler with endless curiosity, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

scDRS: single-cell disease relevance score

Tags: Single-cell

This note is for Zhang, M. J., Hou, K., Dey, K. K., Sakaue, S., Jagadeesh, K. A., Weinand, K., Taychameekiatchai, A., Rao, P., Pisco, A. O., Zou, J., Wang, B., Gandal, M., Raychaudhuri, S., Pasaniuc, B., & Price, A. L. (2022). Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nature Genetics, 54(10), 1572–1580. https://doi.org/10.1038/s41588-022-01167-z

single-cell disease relevance score (scDRS)

links scRNA-seq with polygenic disease risk at single-cell resolution, independent of annotated cell types

  • scDRS identifies cells exhibiting excess expression across disease-associated genes implicated by GWASs
  • applied to 74 diseases/traits and 1.3 million single-cell gene-expression profiles across 31 tissues/organs

  • $n_{cell}$ cells
  • $n_{gene}$ genes
  • cell-gene matrix $\bfX \in \IR^{n_{cell}\times n_{gene}}$
  • $X_{cg}$ represents the expression level of gene $g$ in cell $c$
  • $\bfX$ is size-factor-normalized and log-transformed from the original raw count matrix
  • regress the covariates out from the normalized data
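
The preprocessing described above can be sketched as follows. This is a minimal NumPy illustration, not the scDRS implementation: the function name `preprocess`, the `target_sum` of 1e4, the use of `log1p`, and ordinary least squares for the covariate regression are all assumptions made here for illustration.

```python
import numpy as np

def preprocess(counts, covariates, target_sum=1e4):
    """Size-factor normalize, log-transform, and regress out covariates.

    counts:     (n_cell, n_gene) raw count matrix
    covariates: (n_cell, n_cov) cell-level covariates
    Hypothetical helper; target_sum and log1p are assumed choices.
    """
    counts = np.asarray(counts, dtype=float)
    # Size factor: rescale every cell to the same total count.
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # guard against empty cells
    X = np.log1p(counts / totals * target_sum)
    # Regress covariates (plus an intercept) out of each gene by OLS
    # and keep the residuals.
    design = np.column_stack([np.ones(X.shape[0]), covariates])
    beta, *_ = np.linalg.lstsq(design, X, rcond=None)
    return X - design @ beta

# Toy usage on simulated data.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(50, 20))
covariates = rng.normal(size=(50, 2))
X = preprocess(counts, covariates)
```

Because an intercept is included, the residuals of each gene are centered at zero and are orthogonal to every covariate column.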

given a disease GWAS and an scRNA-seq data set

  • compute a p-value for each individual cell for association with the disease
  • output cell-level normalized disease scores and $B$ sets of normalized control scores that can be used for data visualization and Monte Carlo (MC)-based statistical inference

scDRS consists of three steps

  1. construct a set of putative disease genes from the GWAS summary statistics
  2. compute a raw disease score and $B$ MC samples of raw control scores for each cell
  3. after gene set-wise and cell-wise normalization, scDRS computes an association p-value for each cell by comparing its normalized disease score to the empirical distribution of the pooled normalized control scores across all control gene sets and all cells
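
Steps 2 and 3 can be illustrated with a simplified sketch. The following is not the scDRS implementation, and several choices are assumptions made only for illustration: the raw score is an unweighted mean over the gene set, control gene sets are drawn uniformly at random with the same size as the disease gene set, gene set-wise normalization is a plain centering across cells, and cell-wise normalization standardizes by each cell's own control scores. The function name `scdrs_sketch` and its parameters are hypothetical.

```python
import numpy as np

def scdrs_sketch(X, disease_genes, n_ctrl=200, seed=0):
    """Toy disease scores, control scores, and MC p-values per cell."""
    rng = np.random.default_rng(seed)
    n_cell, n_gene = X.shape
    disease_genes = np.asarray(disease_genes)
    k = disease_genes.size

    # Step 2: raw disease score and B = n_ctrl raw control scores per cell
    # (assumption: unweighted mean, uniformly random control gene sets).
    raw = X[:, disease_genes].mean(axis=1)
    raw_ctrl = np.empty((n_cell, n_ctrl))
    for b in range(n_ctrl):
        ctrl_genes = rng.choice(n_gene, size=k, replace=False)
        raw_ctrl[:, b] = X[:, ctrl_genes].mean(axis=1)

    # Step 3a: gene set-wise normalization (center each set across cells).
    norm = raw - raw.mean()
    norm_ctrl = raw_ctrl - raw_ctrl.mean(axis=0, keepdims=True)

    # Step 3b: cell-wise normalization using each cell's control scores.
    mu = norm_ctrl.mean(axis=1)
    sd = norm_ctrl.std(axis=1, ddof=1)
    sd[sd == 0] = 1.0
    norm = (norm - mu) / sd
    norm_ctrl = (norm_ctrl - mu[:, None]) / sd[:, None]

    # Step 3c: compare each disease score with the pooled control scores
    # across all control gene sets and all cells.
    pooled = np.sort(norm_ctrl.ravel())
    n_ge = pooled.size - np.searchsorted(pooled, norm, side="left")
    pval = (1 + n_ge) / (1 + pooled.size)
    return norm, norm_ctrl, pval

# Toy usage: 300 cells, 100 genes; the first 30 cells over-express genes 0-9.
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(300, 100))
X_toy[:30, :10] += 2.0
norm, norm_ctrl, pval = scdrs_sketch(X_toy, np.arange(10))
```

Adding 1 to both the numerator and the denominator keeps every p-value strictly positive, so the smallest attainable value is $1/(1 + n_{cell} B)$.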
