WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

scMDC: Single-cell Multi-omics Data Clustering Analysis

Tags: Clustering, Single-cell, Multi-omics

This post is for Lin, X., Tian, T., Wei, Z., & Hakonarson, H. (2022). Clustering of single-cell multi-omics data with a multimodal deep learning method. Nature Communications, 13(1), Article 1.

scMDC: single-cell multi-omics data clustering analysis

  • an end-to-end deep model that explicitly characterizes different data sources
  • jointly learns the latent features of a deep embedding for clustering analysis

multimodal sequencing technologies

  • CITE-seq: Cellular Indexing of Transcriptomes and Epitopes by Sequencing
  • REAP-seq: RNA Expression And Protein sequencing assay
  • scATAC-seq: single-cell Assay for Transposase-Accessible Chromatin using sequencing

some multi-omics single-cell technologies have been developed to jointly profile chromatin accessibility and gene expression within a single cell, such as

  • SNARE-seq
  • 10X Single-cell Multiome ATAC + Gene Expression

in the multimodal data, the biological information provided by different modalities is complementary

  • ADT (antibody-derived tag) and mRNA data

Clustering analysis

  • TSCAN: PCA on the scRNA-seq data, followed by a Gaussian Mixture Model (GMM)
  • Seurat: builds a kNN graph based on the Euclidean distance in PCA space, then employs the Louvain/Leiden algorithm to iteratively group cells together by optimizing modularity
  • SC3: spectral clustering based on distance matrices derived from the Euclidean, Pearson, and Spearman metrics, respectively; the individual results are combined into a consensus matrix, which is finally clustered by hierarchical clustering
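To make the graph-based idea concrete, here is a minimal sketch of the two ingredients mentioned for Seurat: a kNN graph built from Euclidean distances, and the modularity score that Louvain/Leiden try to maximize. This is not Seurat's implementation; the helper names `knn_edges` and `modularity` and the toy 2-D points (standing in for cells in PCA space) are my own assumptions, and the actual Louvain/Leiden search over partitions is omitted.

```python
import math

def knn_edges(points, k):
    """Undirected kNN graph: connect i and j if j is among the k nearest neighbours of i."""
    edges = set()
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def modularity(edges, labels):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2m))^2 ] for an unweighted graph,
    where L_c is the number of edges inside community c and d_c its total degree."""
    m = len(edges)
    intra = {}      # edges with both endpoints in the same community
    deg_sum = {}    # total degree per community
    for i, j in edges:
        for node in (i, j):
            deg_sum[labels[node]] = deg_sum.get(labels[node], 0) + 1
        if labels[i] == labels[j]:
            intra[labels[i]] = intra.get(labels[i], 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in deg_sum.items())

# Toy "cells" (hypothetical data): two well-separated squares in a 2-D embedding.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
edges = knn_edges(points, k=2)

q_good = modularity(edges, [0, 0, 0, 0, 1, 1, 1, 1])  # partition matching the two squares
q_bad = modularity(edges, [0, 1, 0, 1, 0, 1, 0, 1])   # partition ignoring the geometry
```

On this toy example the partition that follows the two squares has a higher modularity than the alternating one, which is the signal a Louvain/Leiden-style optimizer would climb.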

clustering analysis of CITE-seq data

  • scDCC: single cell deep constrained clustering framework
  • BREM-SC: hierarchical Bayesian mixture model

similarity matrix-based clustering cannot explicitly consider the dropout events in scRNA-seq data

another line of research focuses on learning a joint embedding of the different modalities

zero-inflated negative binomial (ZINB)

ZINB can effectively characterize scRNA-seq data and improve the representation learning and clustering results
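For reference, a small sketch of the ZINB probability mass function in the usual mean/dispersion parameterization: a negative binomial with mean `mu` and dispersion `theta`, mixed with a point mass at zero of weight `pi` (the dropout probability). The function names are my own, and this only evaluates the distribution; it is not the scMDC loss or model code.

```python
import math

def nb_log_pmf(x, mu, theta):
    """Log pmf of a negative binomial with mean mu > 0 and dispersion theta > 0."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))

def zinb_pmf(x, mu, theta, pi):
    """ZINB pmf: pi * 1[x == 0] + (1 - pi) * NB(x; mu, theta)."""
    nb = math.exp(nb_log_pmf(x, mu, theta))
    return pi * (1.0 if x == 0 else 0.0) + (1.0 - pi) * nb

# Illustrative parameter values (assumptions, not taken from the paper).
mu, theta, pi = 2.0, 1.5, 0.3
p_zero_nb = math.exp(nb_log_pmf(0, mu, theta))   # zero probability without inflation
p_zero_zinb = zinb_pmf(0, mu, theta, pi)         # zero probability with inflation
```

The extra mass at zero is what lets the model account for dropout events: `p_zero_zinb` is always larger than `p_zero_nb` whenever `pi > 0`.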


Published in categories Note