WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Predictive Degrees of Freedom

February 10, 2023

This note is for Luan, B., Lee, Y., & Zhu, Y. (2021). Predictive Model Degrees of Freedom in Linear Regression. ArXiv:2106.15682 [Math].



Tutorial on Polygenic Risk Score

January 24, 2023

This note is based on Choi, S. W., Mak, T. S.-H., & O’Reilly, P. F. (2020). Tutorial: A guide to performing polygenic risk score analyses. Nature Protocols, 15(9), Article 9.



Similarity Network Fusion

December 28, 2022

This post is for Wang, B., Mezlini, A. M., Demir, F., Fiume, M., Tu, Z., Brudno, M., Haibe-Kains, B., & Goldenberg, A. (2014). Similarity network fusion for aggregating data types on a genomic scale. Nature Methods, 11(3), Article 3. It also covers a related paper: Ruan, P., Wang, Y., Shen, R., & Wang, S. (2019). Using association signal annotations to boost similarity network fusion. Bioinformatics, 35(19), 3718–3726.



LD Score Regression

December 15, 2022

This note is for Bulik-Sullivan, B. K., Loh, P.-R., Finucane, H. K., Ripke, S., Yang, J., Patterson, N., Daly, M. J., Price, A. L., & Neale, B. M. (2015). LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature Genetics, 47(3), 291–295.



First Glance at KEGGgraph

November 21, 2022

This post is based on



Joint Local False Discovery Rate in GWAS

November 12, 2022

This note is for Jiang, W., & Yu, W. (2017). Controlling the joint local false discovery rate is more powerful than meta-analysis methods in joint analysis of summary statistics from multiple genome-wide association studies. Bioinformatics, 33(4), 500–507.



Differentiable Sorting and Ranking

November 04, 2022

This note is for Blondel, M., Teboul, O., Berthet, Q., & Djolonga, J. (2020). Fast Differentiable Sorting and Ranking (arXiv:2002.08871). arXiv.



Joint Bayesian Variable and DAG Selection

October 31, 2022

This note is for Cao, X., & Lee, K. (2021). Joint Bayesian Variable and DAG Selection Consistency for High-dimensional Regression Models with Network-structured Covariates. Statistica Sinica.



Integrative Bayesian Analysis of High-dimensional Multiplatform Genomics Data

October 30, 2022

This note is for Wang, W., Baladandayuthapani, V., Morris, J. S., Broom, B. M., Manyam, G., & Do, K.-A. (2013). iBAG: Integrative Bayesian analysis of high-dimensional multiplatform genomics data. Bioinformatics, 29(2), 149–159.



Bayesian Hierarchical Varying-Sparsity Regression Models with Application to Cancer Proteogenomics

October 29, 2022

This note is for Ni, Y., Stingo, F. C., Ha, M. J., Akbani, R., & Baladandayuthapani, V. (2019). Bayesian Hierarchical Varying-Sparsity Regression Models with Application to Cancer Proteogenomics. Journal of the American Statistical Association, 114(525), 48–60.



Simultaneous Estimation of Cell Type Proportions and Cell Type-specific Gene Expressions

October 12, 2022

This note is for Tang, D., Park, S., & Zhao, H. (2022). SCADIE: Simultaneous estimation of cell type proportions and cell type-specific gene expressions using SCAD-based iterative estimating procedure. Genome Biology, 23(1), 129.



scDesign3: A Single-cell Simulator

October 10, 2022

This note is based on Jingyi Jessica Li’s talk on Song, D., Wang, Q., Yan, G., Liu, T., & Li, J. J. (2022). A unified framework of realistic in silico data generation and statistical model inference for single-cell and spatial omics (p. 2022.09.20.508796). bioRxiv.



ADAM and AMSGrad for Stochastic Optimization

October 09, 2022

This post is based on



Single-cell Graph Neural Network

October 08, 2022

This note is for Prof. Dong Xu’s talk on Wang, J., Ma, A., Chang, Y., Gong, J., Jiang, Y., Qi, R., Wang, C., Fu, H., Ma, Q., & Xu, D. (2021). scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses. Nature Communications, 12(1), Article 1.



Contrastive Learning: A Simple Framework and A Theoretical Analysis

October 06, 2022

This note is based on



Joint Model of Longitudinal and Survival Data

October 02, 2022

This post is based on Rizopoulos, D. (2017). An Introduction to the Joint Modeling of Longitudinal and Survival Data, with Applications in R. 235.



Debiased Inverse-Variance Weighted Estimator in Mendelian Randomization

September 20, 2022

This post is for the talk at Yale given by Prof. Ting Ye based on the paper Ye, T., Shao, J., & Kang, H. (2020). Debiased Inverse-Variance Weighted Estimator in Two-Sample Summary-Data Mendelian Randomization (arXiv:1911.09802). arXiv.



Multicenter IPF-PRO Registry Cohort

August 25, 2022

This note is for Todd, J. L., Vinisko, R., Liu, Y., Neely, M. L., Overton, R., Flaherty, K. R., Noth, I., Newby, L. K., Lasky, J. A., Olman, M. A., Hesslinger, C., Leonard, T. B., Palmer, S. M., & Belperio, J. A. (2020). Circulating matrix metalloproteinases and tissue metalloproteinase inhibitors in patients with idiopathic pulmonary fibrosis in the multicenter IPF-PRO Registry cohort. BMC Pulmonary Medicine, 20(1), 64.



Fitting to Future Observations

July 21, 2022

This note is for Jiang, Y., & Liu, C. (2022). Estimation of Over-parameterized Models via Fitting to Future Observations (arXiv:2206.01824). arXiv.



Machine Learning for Multi-omics Data

July 15, 2022

This note is based on Cai, Z., Poulos, R. C., Liu, J., & Zhong, Q. (2022). Machine learning for multi-omics data integration in cancer. iScience, 25(2), 103798.



Review on Multi-omics Data

July 14, 2022

This note is based on Subramanian, I., Verma, S., Kumar, S., Jere, A., & Anamika, K. (2020). Multi-omics Data Integration, Interpretation, and Its Application. Bioinformatics and Biology Insights, 14, 1177932219899051.



Monotone Multi-Layer Perceptron

July 04, 2022

This note is for the monotone multi-layer perceptron neural network; the references are from the R package monmlp.



Test of Monotonicity by Calibrating for Linear Functions

May 11, 2022

This note is for Hall, P., & Heckman, N. E. (2000). Testing for Monotonicity of a Regression Mean by Calibrating for Linear Functions. The Annals of Statistics, 28(1), 20–39.



Test of Monotonicity and Convexity by Splines

April 23, 2022

This note is for Wang, J. C., & Meyer, M. C. (2011). Testing the monotonicity or convexity of a function using regression splines. The Canadian Journal of Statistics / La Revue Canadienne de Statistique, 39(1), 89–107.



Test of Monotonicity by U-processes

April 23, 2022

This note is for Ghosal, S., Sen, A., & van der Vaart, A. W. (2000). Testing Monotonicity of Regression. The Annals of Statistics, 28(4), 1054–1082.



Monotonicity in Asset Returns

April 20, 2022

This note is for Patton, A. J., & Timmermann, A. (2010). Monotonicity in asset returns: New tests with applications to the term structure, the CAPM, and portfolio sorts. Journal of Financial Economics, 98(3), 605–625.



Test of Monotonicity

April 20, 2022

This note is for Chetverikov, D. (2019). Testing Regression Monotonicity in Econometric Models. Econometric Theory, 35(4), 729–776.



Big Data Paradox

April 07, 2022

This note is for Meng, X.-L. (2018). Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. The Annals of Applied Statistics, 12(2).



Adaptive Ridge Estimate

March 30, 2022

This note is for Grandvalet, Y. (1998). Least Absolute Shrinkage is Equivalent to Quadratic Penalization. In L. Niklasson, M. Bodén, & T. Ziemke (Eds.), ICANN 98 (pp. 201–206). Springer London.



Mixture of Location-Scale Families

March 25, 2022

This note is for Chen, J., Li, P., & Liu, G. (2020). Homogeneity testing under finite location-scale mixtures. Canadian Journal of Statistics, 48(4), 670–684.



Scale Mixture Models

March 25, 2022

This note is for scale mixture models.



Prediction Risk for the Horseshoe Regression

March 24, 2022

This note is for Bhadra, A., Datta, J., Li, Y., Polson, N. G., & Willard, B. (2019). Prediction Risk for the Horseshoe Regression. 39.



Estimation of Location and Scale Parameters of Continuous Density

March 22, 2022

This note is for Pitman, E. J. G. (1939). The Estimation of the Location and Scale Parameters of a Continuous Population of any Given Form. Biometrika, 30(3/4), 391–421. It also covers Kagan, A. M., & Rukhin, A. L. (1967). On the estimation of a scale parameter. Theory of Probability & Its Applications, 12, 672–678.



Equivariance

March 22, 2022

This post is for Chapter 3 of Lehmann, E. L., & Casella, G. (1998). Theory of point estimation (2nd ed). Springer.



Applications with Scale Parameters

March 22, 2022

This note covers several papers related to scale parameters.



Leave-one-out CV for Lasso

March 14, 2022

This note is for Homrighausen, D., & McDonald, D. J. (2013). Leave-one-out cross-validation is risk consistent for lasso. ArXiv:1206.6128 [Math, Stat].



Neuronized Priors for Bayesian Sparse Linear Regression

January 16, 2022

This note is for Shin, M., & Liu, J. S. (2021). Neuronized Priors for Bayesian Sparse Linear Regression. Journal of the American Statistical Association, 1–16.



Empirical Bayes

January 16, 2022

This note is based on Sec. 4.6 of Lehmann, E. L., & Casella, G. (1998). Theory of point estimation (2nd ed). Springer.



Magnetic Field Orientations in Star Formation

January 12, 2022



Generalizing Ridge Regression

December 14, 2021

This note is for Chapter 3 of van Wieringen, W. N. (2021). Lecture notes on ridge regression. ArXiv:1509.09169 [Stat].



Gaussian Processes for Regression

December 13, 2021

This note is for Chapter 4 of Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian processes for machine learning. MIT Press.



Additive Model with Linear Smoother

December 07, 2021

This note is for Buja, A., Hastie, T., & Tibshirani, R. (1989). Linear Smoothers and Additive Models. The Annals of Statistics, 17(2), 453–510. JSTOR.



Asymptotics of Cross Validation

December 03, 2021

This note is for Austern, M., & Zhou, W. (2020). Asymptotics of Cross-Validation. ArXiv:2001.11111 [Math, Stat].



Review on Random Matrix Theory

December 01, 2021

This note is for Paul, D., & Aue, A. (2014). Random matrix theory in statistics: A review. Journal of Statistical Planning and Inference, 150, 1–29.



Probabilistic Principal Curves

November 22, 2021

This note is for Chang, K.-Y., & Ghosh, J. (2001). A unified model for probabilistic principal surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(1), 22–41. It covers only the principal curves.



Regularization-Free Principal Curves

November 21, 2021

This note is for Gerber, S., & Whitaker, R. (2013). Regularization-Free Principal Curve Estimation. 18.



Invariant Risk Minimization

November 19, 2021

This note is for Arjovsky, M., Bottou, L., Gulrajani, I., & Lopez-Paz, D. (2020). Invariant Risk Minimization. ArXiv:1907.02893 [Cs, Stat].



Causal Inference by Invariant Prediction

November 19, 2021

This note is for Peters, J., Bühlmann, P., & Meinshausen, N. (2016). Causal inference by using invariant prediction: Identification and confidence intervals. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(5), 947–1012.



Infinite Relational Model

November 18, 2021

This note is based on Kemp, C., Tenenbaum, J. B., Griffiths, T. L., Yamada, T., & Ueda, N. (n.d.). Learning Systems of Concepts with an Infinite Relational Model. 8. It also draws on Saad, F. A., & Mansinghka, V. K. (2021). Hierarchical Infinite Relational Model. ArXiv:2108.07208 [Cs, Stat].



Multidimensional Monotone Bayesian Additive Regression Tree

November 17, 2021

This note is for Chipman, H. A., George, E. I., McCulloch, R. E., & Shively, T. S. (2021). mBART: Multidimensional Monotone BART. ArXiv:1612.01619 [Stat].


