Cluster Analysis of Transcriptomic Datasets of IPF
Kraven, L. M., Taylor, A. R., Molyneaux, P. L., Maher, T. M., McDonough, J. E., Mura, M., Yang, I. V., Schwartz, D. A., Huang, Y., Noth, I., Ma, S. F., Yeo, A. J., Fahy, W. A., Jenkins, R. G., & Wain, L. V. (2023). Cluster analysis of transcriptomic datasets to identify endotypes of idiopathic pulmonary fibrosis. Thorax, 78(6), 551–558.
Considerable clinical heterogeneity in IPF suggests the existence of multiple disease endotypes.
The authors co-normalized, pooled, and clustered three publicly available blood transcriptomic datasets (total 220 IPF cases).
They compared clinical traits across clusters and used gene enrichment analysis to identify biological pathways and processes that were over-represented among the genes differentially expressed across clusters.
A gene-based classifier was developed and validated using three additional independent datasets (total 194 IPF cases).
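The co-normalization step can be illustrated with a deliberately simplified sketch. As its name indicates, COCONUT builds on ComBat and uses control samples to derive the correction applied to cases. The code below swaps in plain per-gene standardisation against each dataset's controls, which is an assumption made for illustration and not the published method. The function name `conormalize_with_controls` and the toy values are hypothetical.

```python
from statistics import mean, stdev


def conormalize_with_controls(datasets):
    """Pool cases from several datasets after rescaling each gene by its controls.

    datasets: dict mapping a dataset name to {"controls": rows, "cases": rows},
    where each row is a list of expression values in the same gene order.
    Simplified stand-in for COCONUT: no empirical Bayes step is performed.
    """
    pooled_cases = []
    for data in datasets.values():
        controls = data["controls"]
        n_genes = len(controls[0])
        # Per-gene location and scale, estimated from this dataset's controls only.
        centre = [mean([row[g] for row in controls]) for g in range(n_genes)]
        spread = [stdev([row[g] for row in controls]) for g in range(n_genes)]
        for row in data["cases"]:
            # Apply the control-derived shift and scale to each case sample.
            # A zero spread falls back to 1.0 to avoid division by zero.
            pooled_cases.append(
                [(row[g] - centre[g]) / (spread[g] or 1.0) for g in range(n_genes)]
            )
    return pooled_cases


# Toy example: dataset "B" is dataset "A" with a constant batch offset of 100.
toy = {
    "A": {"controls": [[1.0, 10.0], [3.0, 14.0]], "cases": [[4.0, 12.0]]},
    "B": {"controls": [[101.0, 110.0], [103.0, 114.0]], "cases": [[104.0, 112.0]]},
}
pooled = conormalize_with_controls(toy)
```

In this toy case the two case samples end up with the same values after normalization, because the only difference between the datasets was a constant offset that the controls absorb.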
- identified three clusters of patients with IPF with statistically significant differences in lung function and mortality between groups
- developed and validated a 13-gene cluster classifier that predicted mortality in IPF (high-risk clusters vs low-risk clusters: HR 4.25). Open question: with three clusters, how are the high-risk and low-risk groups defined?
- identify blood gene expression signatures capable of discerning groups of patients with IPF with significant differences in survival
- co-normalized the discovery datasets using the COmbat CO-Normalization Using conTrols (COCONUT) method
- used COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL) to identify the optimal number of clusters within the pooled, co-normalized data
- developed a gene expression-based classifier
- compared the classifier’s performance at predicting survival in IPF
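A gene expression-based cluster classifier can be sketched as a nearest-centroid rule. This is an assumption for illustration only: these notes do not record how the 13-gene classifier was actually built. The function names, the three toy cluster labels, the two-gene toy profiles, and the choice of which clusters count as high-risk are all hypothetical.

```python
import math


def fit_centroids(samples, labels):
    """Return the mean expression profile (centroid) of each cluster label."""
    sums, counts = {}, {}
    for profile, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_cluster(centroids, profile):
    """Assign a new sample to the cluster with the closest centroid (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], profile))


def risk_group(cluster, high_risk_clusters):
    """Collapse a cluster label into two risk groups.

    Which clusters are treated as high-risk is passed in explicitly, because
    that mapping is not stated in these notes (placeholder assumption).
    """
    return "high-risk" if cluster in high_risk_clusters else "low-risk"


# Toy training data: two genes per sample, three clusters "A", "B", "C".
train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0], [10.0, 0.0], [10.0, 1.0]]
train_y = ["A", "A", "B", "B", "C", "C"]

centroids = fit_centroids(train_x, train_y)
new_cluster = predict_cluster(centroids, [9.0, 1.0])
new_risk = risk_group(new_cluster, high_risk_clusters={"B", "C"})  # placeholder mapping
```

Keeping the cluster-to-risk mapping as an explicit argument makes the open question about the high-risk versus low-risk definition visible in the code instead of hiding it inside the classifier.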