Cluster Analysis of Transcriptomic Datasets of IPF

Tags: Clustering, IPF

Considerable clinical heterogeneity in idiopathic pulmonary fibrosis (IPF) suggests the existence of multiple disease endotypes.

methods:

Three publicly available blood transcriptomic datasets (220 IPF cases in total) were co-normalized, pooled, and clustered.
Clinical traits were compared across clusters, and gene enrichment analysis was used to identify biological pathways and processes over-represented among the genes differentially expressed across clusters.
A gene-based classifier was developed and validated using three additional independent datasets (194 IPF cases in total).

findings:

Three clusters of patients with IPF were identified, with statistically significant differences in lung function and mortality between groups.
A 13-gene cluster classifier was developed and validated; it predicted mortality in IPF (high-risk clusters vs low-risk clusters: HR 4.25).
Open question: with three clusters, how are the low-risk and high-risk groups defined?
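To make the high-risk vs low-risk comparison concrete, here is a minimal sketch of how three clusters could be collapsed into two risk groups and a crude hazard ratio estimated. Everything in it is an assumption for illustration: the notes do not record how the paper grouped the clusters, the patient records are made-up toy data, and the ratio of event rates only approximates a hazard ratio when hazards are roughly constant over time. It is not the model behind the reported HR of 4.25.

```python
def event_rate(patients):
    """Events per unit of follow-up time (e.g. deaths per person-year)."""
    deaths = sum(p["died"] for p in patients)
    follow_up = sum(p["years"] for p in patients)
    return deaths / follow_up


def crude_hazard_ratio(patients, high_risk_clusters):
    """Ratio of event rates between the pooled high-risk and low-risk groups.

    Fails with ZeroDivisionError if the low-risk group has no events.
    """
    high = [p for p in patients if p["cluster"] in high_risk_clusters]
    low = [p for p in patients if p["cluster"] not in high_risk_clusters]
    return event_rate(high) / event_rate(low)


# Toy data, invented for illustration only.
toy_patients = [
    {"cluster": 1, "years": 1.0, "died": 1},
    {"cluster": 1, "years": 3.0, "died": 1},
    {"cluster": 2, "years": 2.0, "died": 1},
    {"cluster": 2, "years": 4.0, "died": 0},
    {"cluster": 3, "years": 5.0, "died": 0},
    {"cluster": 3, "years": 5.0, "died": 1},
]

# Hypothetical grouping: clusters 1 and 2 pooled as "high-risk".
hr = crude_hazard_ratio(toy_patients, high_risk_clusters={1, 2})
print(f"crude hazard ratio: {hr:.2f}")
```

Changing `high_risk_clusters` shows how much the ratio depends on which clusters are pooled, which is exactly why the grouping rule matters.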

interpretation:

Blood gene expression signatures can discern groups of patients with IPF with significant differences in survival.

discovery stage:

The discovery datasets were co-normalized using the COmbat CO-Normalization Using conTrols (COCONUT) method.
The Combined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL) method was used to identify the optimal number of clusters within the pooled, co-normalized data.
A gene expression-based classifier was developed.
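The idea behind control-based co-normalization can be sketched in a few lines. This is a deliberately simplified illustration, not COCONUT itself: COCONUT uses ComBat's empirical Bayes estimates, whereas the sketch below only does a per-gene location and scale adjustment. The function names and the toy data layout are assumptions made for this example. The point it shows is that batch parameters are estimated from healthy controls only and then applied to the cases, so that disease signal does not leak into the correction.

```python
from statistics import mean, pstdev


def control_stats(controls):
    """Per-gene (mean, SD) computed from control samples only."""
    n_genes = len(controls[0])
    return [
        (mean(s[g] for s in controls), pstdev([s[g] for s in controls]))
        for g in range(n_genes)
    ]


def conormalize(datasets):
    """Shift and scale each dataset's cases onto a shared control reference.

    datasets: list of {"controls": [...], "cases": [...]}, where each sample
    is a list of expression values in the same gene order.
    Returns the pooled, adjusted case samples.
    """
    per_dataset = [control_stats(d["controls"]) for d in datasets]
    n_genes = len(per_dataset[0])
    # Shared reference: average of the per-dataset control means and SDs.
    ref = [
        (mean(st[g][0] for st in per_dataset), mean(st[g][1] for st in per_dataset))
        for g in range(n_genes)
    ]
    pooled = []
    for d, stats in zip(datasets, per_dataset):
        for sample in d["cases"]:
            adjusted = []
            for g, x in enumerate(sample):
                mu, sd = stats[g]
                ref_mu, ref_sd = ref[g]
                scale = ref_sd / sd if sd > 0 else 1.0
                adjusted.append((x - mu) * scale + ref_mu)
            pooled.append(adjusted)
    return pooled


# Toy data: the second dataset carries a +10 batch shift on its single gene.
toy = [
    {"controls": [[1.0], [3.0]], "cases": [[5.0]]},
    {"controls": [[11.0], [13.0]], "cases": [[15.0]]},
]
pooled_cases = conormalize(toy)
print(pooled_cases)
```

After adjustment the two toy cases land on the same value, because their only difference was the batch shift visible in the controls.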

The classifier’s performance at predicting survival in IPF was compared.
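One common way to score how well a classifier predicts survival is Harrell's concordance index (C-index). The notes do not say which metric the paper used, so the sketch below is an assumption about how such a comparison could be done, with made-up toy data. It counts, over all comparable patient pairs, how often the patient with the higher risk score has the shorter observed survival.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    times: follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk_scores: higher means predicted higher risk
    Ties in risk score count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable only if patient i died before patient j's last follow-up.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration: score 1 = assigned to a high-risk cluster.
times = [5, 12, 20, 30]
events = [1, 1, 0, 1]
scores = [1, 1, 0, 0]
c_index = concordance_index(times, events, scores)
print(f"C-index: {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so two classifiers can be compared by computing this on the same validation cohort.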

