WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

sctransform: Normalization using Regularized Negative Binomial Regression

February 24, 2024

The note is for Hafemeister, C., & Satija, R. (2019). Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biology, 20(1), 296.




BLiP: Bayesian Linear Programming

February 09, 2024

The note is for Spector, A., & Janson, L. (2023). Controlled Discovery and Localization of Signals via Bayesian Linear Programming (arXiv:2203.17208). arXiv.




Post-clustering Inference under Dependency

February 08, 2024

This post is for González-Delgado, J., Cortés, J., & Neuvial, P. (2023). Post-clustering Inference under Dependency (arXiv:2310.11822). arXiv.




Bipartite eQTL Network Construction

February 08, 2024

This post is for Gaynor, S. M., Fagny, M., Lin, X., Platig, J., & Quackenbush, J. (2022). Connectivity in eQTL networks dictates reproducibility and genomic properties. Cell Reports Methods, 2(5), 100218.




Selective Inference for Hierarchical Clustering

February 08, 2024

This note is for Gao, L. L., Bien, J., & Witten, D. (2022). Selective Inference for Hierarchical Clustering (arXiv:2012.02936). arXiv.




Contrasting Genetic Architectures using Fast Variance Components Analysis

February 07, 2024

This note is for Loh, P.-R., Bhatia, G., Gusev, A., Finucane, H. K., Bulik-Sullivan, B. K., Pollack, S. J., de Candia, T. R., Lee, S. H., Wray, N. R., Kendler, K. S., O’Donovan, M. C., Neale, B. M., Patterson, N., & Price, A. L. (2015). Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance components analysis. Nature Genetics, 47(12), 1385–1392.




Joint Model in High Dimension

January 31, 2024

This post is for Liu, M., Sun, J., Herazo-Maya, J. D., Kaminski, N., & Zhao, H. (2019). Joint Models for Time-to-Event Data and Longitudinal Biomarkers of High Dimension. Statistics in Biosciences, 11(3), 614–629.




BAMLSS: Flexible Bayesian Additive Joint Model

January 31, 2024

This post is for Köhler, M., Umlauf, N., Beyerlein, A., Winkler, C., Ziegler, A.-G., & Greven, S. (2017). Flexible Bayesian additive joint models with an application to type 1 diabetes research. Biometrical Journal, 59(6), 1144–1165.




Effective Gene Expression Prediction

January 26, 2024

This note is for Avsec, Ž., Agarwal, V., Visentin, D., Ledsam, J. R., Grabska-Barwinska, A., Taylor, K. R., Assael, Y., Jumper, J., Kohli, P., & Kelley, D. R. (2021). Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods, 18(10), 1196–1203.




Edgeworth Expansion

January 24, 2024

This note is based on Shao, J. (2003). Mathematical statistics (2nd ed.). Springer; and Hwang, J. (2019). Note on Edgeworth Expansions and Asymptotic Refinements of Percentile t-Bootstrap Methods. Bootstrap Methods.




t-Test for Mixture Normal Data

January 23, 2024

The post is for Lee, A. F. S., & Gurland, J. (1977). One-Sample t-Test When Sampling from a Mixture of Normal Distributions. The Annals of Statistics, 5(4), 803–807.




Fine-mapping from Summary Data with SuSiE

January 22, 2024

This post is for Zou, Y., Carbonetto, P., Wang, G., & Stephens, M. (2022). Fine-mapping from summary data with the “Sum of Single Effects” model. PLOS Genetics, 18(7), e1010299.




SuSiE: Sum of Single Effects Model

January 22, 2024

This note is for Wang, G., Sarkar, A., Carbonetto, P., & Stephens, M. (2020). A Simple New Approach to Variable Selection in Regression, with Application to Genetic Fine Mapping. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(5), 1273–1300.




Statistical Learning and Selective Inference

January 19, 2024

This post is for Taylor, J., & Tibshirani, R. J. (2015). Statistical learning and selective inference. Proceedings of the National Academy of Sciences of the United States of America, 112(25), 7629–7634.




Exact Post-Selection Inference for Sequential Regression Procedures

January 19, 2024

This post is for Tibshirani, R. J., Taylor, J., Lockhart, R., & Tibshirani, R. (2016). Exact Post-Selection Inference for Sequential Regression Procedures. Journal of the American Statistical Association, 111(514), 600–620.




FDR Control in GLM

January 15, 2024

This post is for Dai, C., Lin, B., Xing, X., & Liu, J. S. (2023). A Scale-Free Approach for False Discovery Rate Control in Generalized Linear Models. Journal of the American Statistical Association, 118(543), 1551–1565.




MMRM: Mixed-Models for Repeated Measures

January 06, 2024

This post is based on vignettes of MMRM R package: https://openpharma.github.io/mmrm/main/index.html




One-way Matching with Low Rank

January 06, 2024

This post is for Chen, Shuxiao, Sizun Jiang, Zongming Ma, Garry P. Nolan, and Bokai Zhu. “One-Way Matching of Datasets with Low Rank Signals.” arXiv, October 3, 2022.




CountSplit for scRNA Data

December 08, 2023

The post is for Neufeld, Anna, Lucy L Gao, Joshua Popp, Alexis Battle, and Daniela Witten. “Inference after Latent Variable Estimation for Single-cell RNA Sequencing Data.” Biostatistics, December 13, 2022, kxac047.




Uncertainty of Pseudotime Trajectory

December 04, 2023

This post is for Tenha, Lovemore, and Mingzhou Song. “Statistical Evidence for the Presence of Trajectory in Single-cell Data.” BMC Bioinformatics 23, no. Suppl 8 (August 16, 2022): 340.




ClusterDE: a post-clustering DE method

December 04, 2023

This post is for Song, Dongyuan, Kexin Li, Xinzhou Ge, and Jingyi Jessica Li. “ClusterDE: A Post-Clustering Differential Expression (DE) Method Robust to False-Positive Inflation Caused by Double Dipping,” 2023




Approximation to Log-likelihood of Nonlinear Mixed-effects Model

November 26, 2023

This post is for Pinheiro, José C., and Douglas M. Bates. “Approximations to the Log-Likelihood Function in the Nonlinear Mixed-Effects Model.” Journal of Computational and Graphical Statistics 4, no. 1 (1995): 12–35.




Hierarchical Multi-label Contrastive Learning

November 25, 2023

This post is for Zhang, Shu, Ran Xu, Caiming Xiong, and Chetan Ramaiah. “Use All the Labels: A Hierarchical Multi-Label Contrastive Learning Framework,” 16660–69, 2022.




Hierarchical Multi-Label Classification

November 20, 2023

This post is for two papers on Hierarchical multi-label classification (HMC), which imposes a hierarchy constraint on the classes.




Consistent Probabilities along GO Structure

November 16, 2023

This note is for Obozinski, Guillaume, Gert Lanckriet, Charles Grant, Michael I. Jordan, and William Stafford Noble. “Consistent Probabilistic Outputs for Protein Function Prediction.” Genome Biology 9 Suppl 1, no. Suppl 1 (2008): S6.




scHOT: Investigate higher-order interactions in single-cell data

October 13, 2023

This note is for Ghazanfar, Shila, Yingxin Lin, Xianbin Su, David Ming Lin, Ellis Patrick, Ze-Guang Han, John C. Marioni, and Jean Yee Hwa Yang. “Investigating Higher-Order Interactions in Single-cell Data with scHOT.” Nature Methods 17, no. 8 (August 2020): 799–806.




Constrained Smoothing and Out-of-range Prediction using P-splines

September 22, 2023

This note is for Navarro-García, M., Guerrero, V., & Durban, M. (2023). On constrained smoothing and out-of-range prediction using P-splines: A conic optimization approach. Applied Mathematics and Computation, 441, 127679.




An Iterative Procedure for Shape-constrained Smoothing using Smoothing Splines

September 21, 2023

This note is for Turlach, B. A. (2005). Shape constrained smoothing using smoothing splines. Computational Statistics, 20(1), 81–104.




Shape-Constrained Estimation Using Nonnegative Splines

September 21, 2023

This note is for Papp, D., & Alizadeh, F. (2014). Shape-Constrained Estimation Using Nonnegative Splines. Journal of Computational and Graphical Statistics, 23(1), 211–231.




Fast and Flexible methods for monotone polynomial fitting

September 21, 2023

This note is for Murray, K., Müller, S., & Turlach, B. (2016). Fast and flexible methods for monotone polynomial fitting. Journal of Statistical Computation and Simulation, 86, 1–21.




Confidence Intervals of Smoothed Isotonic Regression

September 21, 2023

This note is for Groeneboom, P., & Jongbloed, G. (2023). Confidence intervals in monotone regression (arXiv:2303.17988). arXiv.




Lamian: Differential Pseudotime Analysis

September 14, 2023

This note is for Hou, W., Ji, Z., Chen, Z., Wherry, E. J., Hicks, S. C., & Ji, H. (2021). A statistical framework for differential pseudotime analysis with multiple single-cell RNA-seq samples (p. 2021.07.10.451910). bioRxiv.




condiments: Trajectory Inference across Multiple Conditions

September 14, 2023

The note is for Van den Berge, K., Roux de Bézieux, H., Street, K., Saelens, W., Cannoodt, R., Saeys, Y., Dudoit, S., & Clement, L. (2020). Trajectory-based differential expression analysis for single-cell sequencing data. Nature Communications, 11(1), Article 1.




In-Context Learning via Transformers

September 14, 2023

This note is for Garg, S., Tsipras, D., Liang, P., & Valiant, G. (2023). What Can Transformers Learn In-Context? A Case Study of Simple Function Classes (arXiv:2208.01066). arXiv.




Six Statistical Senses

August 28, 2023

This note is for Craiu, R. V., Gong, R., & Meng, X.-L. (2023). Six Statistical Senses. Annual Review of Statistics and Its Application, 10(1), 699–725.




tradeSeq: Trajectory-based differential expression analysis for single-cell sequencing data

July 31, 2023

This post is for Van den Berge, K., Roux de Bézieux, H., Street, K., Saelens, W., Cannoodt, R., Saeys, Y., Dudoit, S., & Clement, L. (2020). Trajectory-based differential expression analysis for single-cell sequencing data. Nature Communications, 11(1), Article 1.




PseudotimeDE: Differential Gene Expression along Cell Pseudotime

July 27, 2023

The note is for Song, D., & Li, J. J. (2021). PseudotimeDE: Inference of differential gene expression along cell pseudotime with well-calibrated p-values from single-cell RNA sequencing data. Genome Biology, 22(1), 124.




scMDC: Single-cell Multi-omics Data Clustering Analysis

July 27, 2023

This post is for Lin, X., Tian, T., Wei, Z., & Hakonarson, H. (2022). Clustering of single-cell multi-omics data with a multimodal deep learning method. Nature Communications, 13(1), Article 1.




Benchmarking Algorithms for Gene Regulatory Network Inference

July 14, 2023

This note is for Pratapa, A., Jalihal, A. P., Law, J. N., Bharadwaj, A., & Murali, T. M. (2020). Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nature Methods, 17(2), Article 2.




Cell type-specific and disease-associated eQTL in the human lung

July 13, 2023

This post is for Natri, H. M., Azodi, C. B. D., Peter, L., Taylor, C. J., Chugh, S., Kendle, R., Chung, M., Flaherty, D. K., Matlock, B. K., Calvi, C. L., Blackwell, T. S., Ware, L. B., Bacchetta, M., Walia, R., Shaver, C. M., Kropski, J. A., McCarthy, D. J., & Banovich, N. E. (2023). Cell type-specific and disease-associated eQTL in the human lung (p. 2023.03.17.533161). bioRxiv.




Cluster Analysis of Transcriptomic Datasets of IPF

July 10, 2023

This post is for Kraven, L. M., Taylor, A. R., Molyneaux, P. L., Maher, T. M., McDonough, J. E., Mura, M., Yang, I. V., Schwartz, D. A., Huang, Y., Noth, I., Ma, S. F., Yeo, A. J., Fahy, W. A., Jenkins, R. G., & Wain, L. V. (2023). Cluster analysis of transcriptomic datasets to identify endotypes of idiopathic pulmonary fibrosis. Thorax, 78(6), 551–558.




XGBoost for IPF Biomarker

July 10, 2023

This post is for Fanidis, D., Pezoulas, V. C., Fotiadis, D. Ι., & Aidinis, V. (2023). An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers. Computational and Structural Biotechnology Journal, 21, 2305–2315.




Single Cell Generative Pre-trained Transformer

June 30, 2023

This post is for Cui, H., Wang, C., Maan, H., & Wang, B. (2023). scGPT: Towards Building a Foundation Model for Single-cell Multi-omics Using Generative AI (p. 2023.04.30.538439). bioRxiv.




Deep Generative Modeling for Single-cell Transcriptomics

June 29, 2023

The post is for Lopez, R., Regier, J., Cole, M. B., Jordan, M. I., & Yosef, N. (2018). Deep generative modeling for single-cell transcriptomics. Nature Methods, 15(12), Article 12.




Time-varying Group Sparse Additive Model for GWAS

June 11, 2023

This post is for Marchetti-Bowick, M., Yin, J., Howrylak, J. A., & Xing, E. P. (2016). A time-varying group sparse additive model for genome-wide association studies of dynamic complex traits. Bioinformatics, 32(19), 2903–2910.




fGWAS: Dynamic Model for GWAS

June 11, 2023

The note is for Das, K., Li, J., Wang, Z., Tong, C., Fu, G., Li, Y., Xu, M., Ahn, K., Mauger, D., Li, R., & Wu, R. (2011). A dynamic model for genome-wide association studies. Human Genetics, 129(6), 629–639.




GWAS of Longitudinal Trajectories at Biobank Scale

June 11, 2023

This post is for Ko, S., German, C. A., Jensen, A., Shen, J., Wang, A., Mehrotra, D. V., Sun, Y. V., Sinsheimer, J. S., Zhou, H., & Zhou, J. J. (2022). GWAS of longitudinal trajectories at biobank scale. The American Journal of Human Genetics, 109(3), 433–445.




C-index for Time-varying Risk

May 05, 2023

This post is for Gandy, A., & Matcham, T. J. (2022). On concordance indices for models with time-varying risk (arXiv:2208.03213). arXiv.




Age-dependency of PRS for Prostate Cancer

April 21, 2023

This note is for Schaid, D. J., Sinnwell, J. P., Batzler, A., & McDonnell, S. K. (2022). Polygenic risk for prostate cancer: Decreasing relative risk with age but little impact on absolute risk. American Journal of Human Genetics, 109(5), 900–908.




Cox Models with Time-Varying Covariates vs Time-Varying Coefficients

March 28, 2023

This note is for Zhang, Z., Reinikainen, J., Adeleke, K. A., Pieterse, M. E., & Groothuis-Oudshoorn, C. G. M. (2018). Time-varying covariates and coefficients in Cox regression models. Annals of Translational Medicine, 6(7), 121.



