WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

condiments: Trajectory Inference across Multiple Conditions

Tags: Pseudotime, Single-Cell, Differential Expression

The note is for Van den Berge, K., Roux de Bézieux, H., Street, K., Saelens, W., Cannoodt, R., Saeys, Y., Dudoit, S., & Clement, L. (2020). Trajectory-based differential expression analysis for single-cell sequencing data. Nature Communications, 11(1), Article 1.

  1. differential topology: consider the trajectory inference problem and assess whether the dynamic process is fundamentally different between conditions
  2. differential progression and differential differentiation: test for differential abundance of the conditions along lineages (progression) and between lineages (differentiation)
  3. differential expression: estimate gene expression profiles and test whether gene expression patterns differ between conditions along lineages

Suppose we observe $J$ genes in $n$ cells, resulting in a $J\times n$ count matrix $\bY$. For each cell $i$, we know its condition label $c(i)\in\{1,\ldots, C\}$.
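To fix the notation, here is a minimal sketch of the data layout in plain Python. The counts, the names `Y`, `condition`, `library_size`, and the choice of $J = 3$, $n = 4$, $C = 2$ are all made up for illustration.

```python
from collections import Counter

# Toy count matrix Y: J = 3 genes (rows) by n = 4 cells (columns).
# All values are invented for illustration.
Y = [
    [5, 0, 2, 1],  # gene 1
    [0, 3, 0, 4],  # gene 2
    [1, 1, 6, 0],  # gene 3
]

# Condition label c(i) for each cell i, here with C = 2 conditions.
condition = [1, 1, 2, 2]

J, n = len(Y), len(Y[0])
assert all(len(row) == n for row in Y)  # Y must be rectangular
assert len(condition) == n              # one label per cell

# Number of cells observed in each condition.
cells_per_condition = Counter(condition)

# Total count per cell (column sums of Y).
library_size = [sum(Y[j][i] for j in range(J)) for i in range(n)]
```

Genes index the rows and cells index the columns, so the condition labels line up with the columns of $\bY$.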

Assume that for each condition $c$, there is an underlying developmental structure $\cT_c$, or trajectory, that possesses a set of $L_c$ lineages.

For a given cell $i$ with condition $c(i)$, its position along the developmental path $\cT_{c(i)}$ is defined by a vector of $L_{c(i)}$ pseudotimes $\bT_i$ and a unit-norm vector of $L_{c(i)}$ weights $\bW_i$, with $\Vert \bW_i\Vert_1 = 1$ (i.e., there is one pseudotime and one weight per lineage), where

\[\bT_i\sim G_{c(i)}\quad \text{and}\quad \bW_i\sim H_{c(i)}\]

The cumulative distribution functions (CDFs) $G_c$ and $H_c$ are condition-specific.
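With this notation, the differential progression and differential differentiation steps can be phrased as null hypotheses on the two families of CDFs. The following is a sketch of how I read them, assuming all conditions share a common trajectory structure (so that $L_c = L$ for every $c$ and the distributions are comparable); the symbols $\mathcal{H}_0^{\text{prog}}$ and $\mathcal{H}_0^{\text{diff}}$ are my own labels.

\[\mathcal{H}_0^{\text{prog}}: G_1 = \cdots = G_C \quad \text{and}\quad \mathcal{H}_0^{\text{diff}}: H_1 = \cdots = H_C\]

Rejecting the first suggests that cells progress differently along the lineages across conditions, while rejecting the second suggests that cells are allocated differently between lineages.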

The pseudotime values represent how far a cell has progressed along each lineage, while the weights represent how likely it is that a cell belongs to each lineage.
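A small simulation may make the roles of $\bT_i$ and $\bW_i$ concrete. The sketch below draws toy pseudotimes and weights for two conditions and compares the pseudotime distributions along one lineage with a hand-written two-sample Kolmogorov-Smirnov statistic. The uniform distributions, the `time_shift` parameter, and the helper names `simulate_cell` and `ks_statistic` are assumptions made for illustration, and the KS statistic is just one simple way to compare two empirical CDFs, not necessarily the procedure used in the paper.

```python
import random


def simulate_cell(n_lineages, time_shift, rng):
    """Draw toy pseudotimes T_i and weights W_i for a single cell."""
    # One pseudotime per lineage; time_shift mimics a condition effect on G_c.
    T = [rng.uniform(0, 1) + time_shift for _ in range(n_lineages)]
    # Positive raw scores, normalized so the weights sum to 1 (unit L1 norm).
    raw = [rng.uniform(0.1, 1) for _ in range(n_lineages)]
    total = sum(raw)
    W = [r / total for r in raw]
    return T, W


def ks_statistic(x, y):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        t = min(xs[i], ys[j])
        # Advance both samples past every value <= t.
        while i < len(xs) and xs[i] <= t:
            i += 1
        while j < len(ys) and ys[j] <= t:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d


rng = random.Random(0)
cells_a = [simulate_cell(2, 0.0, rng) for _ in range(200)]  # condition 1
cells_b = [simulate_cell(2, 0.3, rng) for _ in range(200)]  # condition 2

# Pseudotimes along the first lineage in each condition.
t_a = [T[0] for T, _ in cells_a]
t_b = [T[0] for T, _ in cells_b]
d = ks_statistic(t_a, t_b)
print(f"KS statistic between conditions: {d:.3f}")
```

A statistic near 0 means the two empirical pseudotime distributions nearly coincide, while a value near 1 means they barely overlap. Turning it into a p-value would need a reference distribution (for example by permuting condition labels), which is left out here.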

Published in categories Note