WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

CFPCA for Human Movement Data

April 26, 2020 (Update: )

This post is based on Coffey, N., Harrison, A. J., Donoghue, O. A., & Hayes, K. (2011). Common functional principal components analysis: A new approach to analyzing human movement data. Human Movement Science, 30(6), 1144–1166.

Continue reading



Jackknife and Mutual Information

January 07, 2019 (Update: ) 0 Comments

In this note, the material about Jackknife is based on Wasserman (2006) and Efron and Hastie (2016), while the Jackknife estimation of Mutual Information is based on Zeng et al. (2018).

Continue reading



Common Functional Principal Components

February 29, 2020 (Update: )

This post is based on Benko, M., Härdle, W., & Kneip, A. (2009). Common functional principal components. The Annals of Statistics, 37(1), 1–34.

Continue reading



Equicorrelation Matrix

February 22, 2020 (Update: )

kjytay’s blog summarizes some properties of the equicorrelation matrix, which has the following form,

Continue reading



Exponential Twisting in Importance Sampling

September 18, 2019 (Update: )

This note is based on Ma, J., Du, K., & Gu, G. (2019). An efficient exponential twisting importance sampling technique for pricing financial derivatives. Communications in Statistics - Theory and Methods, 48(2), 203–219.

Continue reading



Generalized Matrix Decomposition

January 17, 2020 (Update: )

This post is based on the talk given by Dr. Yue Wang at the Department of Statistics and Data Science, Southern University of Science and Technology on Jan. 04, 2020.

Continue reading



Statistical Inference with Unnormalized Models

February 10, 2020 (Update: )

This post is based on the talk given by T. Kanamori at the 11th ICSA International Conference on Dec. 22nd, 2019.

Continue reading



Tweedie's Formula and Selection Bias

March 11, 2019 (Update: )

Prof. Inchi HU will give a talk tomorrow on Large Scale Inference for Chi-squared Data, which proposes using Tweedie’s formula in a Bayesian hierarchical model for chi-squared data. He mentioned a thought-provoking paper, Efron, B. (2011). Tweedie’s Formula and Selection Bias. Journal of the American Statistical Association, 106(496), 1602–1614, which is the focus of this note.

Continue reading



Gradient-based Sparse Principal Component Analysis

January 05, 2020 (Update: )

This post is based on the talk, Gradient-based Sparse Principal Component Analysis, given by Dr. Yixuan Qiu at the Department of Statistics and Data Science, Southern University of Science and Technology on Jan. 05, 2020.

Continue reading



Quantitative Genetics

December 21, 2019 (Update: )

This post is based on the Pao-Lu Hsu Award Lecture given by Prof. Hongyu Zhao at the 11th ICSA International Conference on Dec. 21st, 2019.

Continue reading



Registration Problem in Functional Data Analysis

January 21, 2020 (Update: )

This post is based on the seminar, Data Acquisition, Registration and Modelling for Multi-dimensional Functional Data, given by Prof. Shi.

Continue reading



Rademacher Complexity

January 16, 2020 (Update: )

This post is based on the material from the second lecture of STAT 6050, taught by Prof. Wicker, and draws on the more formal descriptions in the book Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2012). Foundations of Machine Learning. The MIT Press.

Continue reading



CEASE

December 20, 2019 (Update: )

This post is based on the Peter Hall Lecture given by Prof. Jianqing Fan at the 11th ICSA International Conference on Dec. 20th, 2019.

Continue reading



Theoretical Results of Lasso

March 26, 2019 (Update: )

Prof. Jon A. WELLNER introduced the application of a new multiplier inequality to the lasso in his distinguished lecture, which reminded me that I need to read more theoretical results on the lasso. Hence this post, which is based on Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical Learning with Sparsity. 362.

Continue reading



NGS for NGS

January 11, 2020 (Update: )

This post is based on the talk, Next-Generation Statistical Methods for Association Analysis of Now-Generation Sequencing Studies, given by Dr. Xiang Zhan at the Department of Statistics and Data Science, Southern University of Science and Technology on Jan. 05, 2020.

Continue reading



Rare Variant Association Testing

July 18, 2019 (Update: )

This note is based on

Continue reading



Group Inference in High Dimensions

December 17, 2019 (Update: )

This post is based on the slides for the talk given by Zijian Guo at The International Statistical Conference In Memory of Professor Sik-Yum Lee.

Continue reading



Gibbs Sampler for Finding Motif

December 10, 2018 (Update: )

This post is the online version of my report for Project 2 of STAT 5050, taught by Prof. Wei.

Continue reading



A Stochastic Model for Evolution of Metabolic Network

August 07, 2018 (Update: )

This post contains my notes on Mithani et al. (2009).

Continue reading



Controlling bias and inflation in EWAS/TWAS

December 04, 2019 (Update: )

The post is based on the BIOS Consortium, van Iterson, M., van Zwet, E. W., & Heijmans, B. T. (2017). Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution. Genome Biology, 18(1), 19.

Continue reading



Multivariate Mediation Effects

November 04, 2019 (Update: )

This note is based on Huang, Y.-T. (2019). Variance component tests of multivariate mediation effects under composite null hypotheses. Biometrics, 0(0).

Continue reading



Union-intersection tests and Intersection-union tests

December 02, 2019 (Update: )

This post is based on section 8.3 of Casella and Berger (2001).

Continue reading



Generalized Functional Linear Models with Semiparametric Single-index Interactions

October 29, 2019 (Update: )

This post is based on Li, Y., Wang, N., & Carroll, R. J. (2010). Generalized Functional Linear Models With Semiparametric Single-Index Interactions. Journal of the American Statistical Association, 105(490), 621–633.

Continue reading



Gaussian DAGs on Network Data

November 19, 2019 (Update: )

This post is based on Li, H., & Zhou, Q. (2019). Gaussian DAGs on network data. ArXiv:1905.10848 [Cs, Stat].

Continue reading



Optimal estimation of functionals of high-dimensional mean and covariance matrix

August 26, 2019 (Update: )

This post is based on Fan, J., Weng, H., & Zhou, Y. (2019). Optimal estimation of functionals of high-dimensional mean and covariance matrix. ArXiv:1908.07460 [Math, Stat].

Continue reading



SIR and Its Implementation

January 05, 2019 (Update: ) 0 Comments

Continue reading



Link-free vs. Semiparametric

January 08, 2019 (Update: )

This note is based on Li (1991) and Ma and Zhu (2012).

Continue reading



Sparse LDA

September 17, 2019 (Update: )

This note is based on Shao, J., Wang, Y., Deng, X., & Wang, S. (2011). Sparse linear discriminant analysis by thresholding for high dimensional data. The Annals of Statistics, 39(2), 1241–1265.

Continue reading



Feature Annealed Independent Rules

September 17, 2019 (Update: ) 0 Comments

This note is based on Fan, J., & Fan, Y. (2008). High-dimensional classification using features annealed independence rules. The Annals of Statistics, 36(6), 2605–2637.

Continue reading



Dantzig Selector

August 16, 2019 (Update: )

This post is based on Candes, E., & Tao, T. (2007). The Dantzig selector: Statistical estimation when $p$ is much larger than $n$. The Annals of Statistics, 35(6), 2313–2351.

Continue reading



MLE for MTP2

July 05, 2019 (Update: )

This post is based on Lauritzen, S., Uhler, C., & Zwiernik, P. (2019). Maximum likelihood estimation in Gaussian models under total positivity. The Annals of Statistics, 47(4), 1835–1863.

Continue reading



TreeClone

July 08, 2019 (Update: )

This note is based on Zhou, T., Sengupta, S., Müller, P., & Ji, Y. (2019). TreeClone: Reconstruction of tumor subclone phylogeny based on mutation pairs using next generation sequencing data. The Annals of Applied Statistics, 13(2), 874–899.

Continue reading



Minimax Lower Bounds

June 28, 2019 (Update: )

This note is based on Chapter 15 of Wainwright, M. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge: Cambridge University Press.

Continue reading



Change Points

May 28, 2019 (Update: )

Continue reading



Fourier Series

May 07, 2019 (Update: )

Continue reading



M-estimator

May 09, 2019 (Update: )

Continue reading



Particle Filtering and Smoothing

January 18, 2019 (Update: ) 0 Comments

This note is for Doucet, A., & Johansen, A. M. (2009). A tutorial on particle filtering and smoothing: Fifteen years later. Handbook of Nonlinear Filtering, 12(656–704), 3. For the sake of clarity, I split the general SMC methods (section 3) into my next post.

Continue reading



Generalized Gradient Descent

March 20, 2019 (Update: )

I read kjytay’s blog post, Proximal operators and generalized gradient descent, then read its reference, Hastie et al. (2015), and wrote some programs to get a better understanding.

Continue reading



Multiple Object Tracking

March 26, 2019 (Update: )

This note is for Luo, W., Xing, J., Milan, A., Zhang, X., Liu, W., Zhao, X., & Kim, T.-K. (2014). Multiple Object Tracking: A Literature Review. ArXiv:1409.7618 [Cs].

Continue reading



The Gibbs Sampler

June 04, 2017 (Update: ) 0 Comments

The Gibbs sampler is an iterative algorithm that constructs a dependent sequence of parameter values whose distribution converges to the target joint posterior distribution.
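
As a minimal sketch of the idea (not the code from the full post), the Python snippet below runs a Gibbs sampler on a toy target: a standard bivariate normal with correlation `rho`, standing in for a joint posterior. The target, the function name `gibbs_bivariate_normal`, and the burn-in length are my own illustrative assumptions.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iter, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)  # standard deviation of each full conditional
    x, y = 0.0, 0.0                  # arbitrary starting point
    draws = []
    for _ in range(n_iter):
        x = rng.gauss(rho * y, sd)   # draw x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)   # draw y | x, using the freshly updated x
        draws.append((x, y))
    return draws

draws = gibbs_bivariate_normal(rho=0.8, n_iter=20000)
kept = draws[2000:]                  # discard an (assumed) burn-in period
mean_x = sum(x for x, _ in kept) / len(kept)
mean_xy = sum(x * y for x, y in kept) / len(kept)
```

Successive draws are dependent, but the averages `mean_x` and `mean_xy` should settle near 0 and `rho`, the corresponding moments of the target.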

Continue reading



Tensor Completion

March 07, 2019 (Update: )

Prof. YUAN Ming will give a distinguished lecture on Low Rank Tensor Methods in High Dimensional Data Analysis. To get familiar with his work on tensors, I read his paper, Yuan, M., & Zhang, C.-H. (2016). On Tensor Completion via Nuclear Norm Minimization. Foundations of Computational Mathematics, 16(4), 1031–1068, which is the topic of this post.

Continue reading



SMC for Protein Folding Problem

February 23, 2019 (Update: )

This note is based on Wong, S. W. K., Liu, J. S., & Kou, S. C. (2018). Exploring the conformational space for protein folding with sequential Monte Carlo. The Annals of Applied Statistics, 12(3), 1628–1654.

Continue reading



Select Prior by Formal Rules

March 04, 2019 (Update: )

Larry wrote that “Noninformative priors are a lost cause” in his post, LOST CAUSES IN STATISTICS II: Noninformative Priors, and he mentioned his review paper Kass and Wasserman (1996) on noninformative priors. This note is for this paper.

Continue reading



Bio-chemical Reaction Networks

February 25, 2019 (Update: )

This note is based on Loskot, P., Atitey, K., & Mihaylova, L. (2019). Comprehensive review of models and methods for inferences in bio-chemical reaction networks.

Continue reading



An Illustration of Importance Sampling

July 16, 2017 (Update: ) 0 Comments

This report shows how to use importance sampling to estimate the expectation.
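
A small sketch of what this looks like in practice, under my own toy assumptions rather than the example in the report: estimate the tail probability $P(X > 3)$ for $X \sim N(0, 1)$ by sampling from the shifted proposal $N(3, 1)$ and reweighting each draw by the density ratio $p(x)/q(x) = \exp(-3x + 4.5)$. The function name `importance_sampling_tail` is hypothetical.

```python
import math
import random

def importance_sampling_tail(threshold, n, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) with proposal N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)          # draw from the proposal q
        if x > threshold:                      # indicator h(x) = 1{x > threshold}
            # weight p(x) / q(x) = exp(-threshold * x + threshold^2 / 2)
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n

estimate = importance_sampling_tail(threshold=3.0, n=50000)
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # closed-form normal tail probability
```

Sampling directly from $N(0, 1)$ would hit the event only about once in a thousand draws, whereas the shifted proposal lands in the tail about half the time, which is why `estimate` tracks `exact` closely with a modest sample size.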

Continue reading



Sequential Monte Carlo Methods

June 10, 2017 (Update: ) 0 Comments

A first peep at SMC as an abecedarian; a more comprehensive note can be found here.

Continue reading



Chain-Structured Models

September 08, 2017 (Update: ) 0 Comments

There is an important probability distribution used in many applications, the chain-structured model.

Continue reading



The Applications of Monte Carlo

September 07, 2017 (Update: ) 0 Comments

Continue reading



Growing A Polymer

July 17, 2017 (Update: ) 0 Comments

This report implements the simulation of growing a polymer under the self-avoiding walk model, and summarizes the sequential importance sampling techniques for this problem.

Continue reading



Genetic network inference

March 14, 2017 0 Comments

These are my notes on the paper Genetic network inference.

Continue reading



Systems Genetic Approach

March 16, 2017 0 Comments

These are my notes on the paper System Genetic Approach.

Continue reading



MICA

March 17, 2017 0 Comments

These are my notes on the paper Maximal information component analysis.

Continue reading



MINE

March 17, 2017 0 Comments

These are my notes on the paper Detecting Novel Associations in Large Data Sets.

Continue reading



Implementation of MINE

March 17, 2017 0 Comments

This is an implementation of MINE in R.

Continue reading



Ensemble Learning

May 17, 2017 0 Comments

Continue reading



Illustrations of Support Vector Machines

May 18, 2017 0 Comments

Use the e1071 library in R to demonstrate the support vector classifier and the SVM.

Continue reading



One Parameter Models

June 04, 2017 0 Comments

Continue reading



The Normal Model

June 05, 2017 0 Comments

Continue reading



Sequential Monte Carlo samplers

June 11, 2017 0 Comments

This note is for Del Moral, P., Doucet, A., & Jasra, A. (2006). Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3), 411–436.

Continue reading



SMC for Mixture Distribution

June 11, 2017

Continue reading



ARIMA

July 11, 2017 0 Comments

Any time series without a constant mean over time is nonstationary.
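
To make this concrete, here is a hedged toy illustration in Python (the series and helper names are made up for this sketch): a linear trend plus noise has a mean that drifts over time, while its first difference, the "d" step in ARIMA, has a roughly constant mean.

```python
import random

rng = random.Random(1)
# Hypothetical series: linear trend plus Gaussian noise, so the mean changes with t.
series = [0.5 * t + rng.gauss(0.0, 1.0) for t in range(400)]

def mean(values):
    return sum(values) / len(values)

def difference(values):
    """First difference: y_t - y_{t-1}."""
    return [b - a for a, b in zip(values, values[1:])]

# Means of the two halves of the raw series differ by about 0.5 * 200 = 100.
first_half, second_half = series[:200], series[200:]

# After differencing, both halves have a mean close to the slope 0.5.
diffed = difference(series)
d_first, d_second = diffed[:200], diffed[200:]
```

Comparing `mean(first_half)` with `mean(second_half)`, and then `mean(d_first)` with `mean(d_second)`, shows the drift in the raw series and its removal after one difference.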

Continue reading



Adaptive Importance Sampling

July 16, 2017 0 Comments

Continue reading



Model Specification

July 17, 2017 0 Comments

For a given time series, how do we choose appropriate values for $p, d, q$?

Continue reading



A Bayesian Missing Data Problem

July 18, 2017 0 Comments

Continue reading



Metropolis Algorithm

July 21, 2017 0 Comments

Monte Carlo plays a key role in evaluating integrals and simulating stochastic systems, and the most critical step of a Monte Carlo algorithm is sampling from an appropriate probability distribution $\pi (\mathbf x)$. There are two ways to solve this problem: one is importance sampling, and the other is to produce statistically dependent samples based on the idea of Markov chain Monte Carlo sampling.
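
The second route can be sketched with a random-walk Metropolis sampler. This is only an illustration under my own assumptions: the target is a one-dimensional standard normal known up to a constant, and the names `log_target`, `metropolis`, and the step size are hypothetical choices, not taken from the full post.

```python
import math
import random

def log_target(x):
    """Log of an unnormalized density; a standard normal is used for illustration."""
    return -0.5 * x * x

def metropolis(log_pi, x0, n_iter, step=2.0, seed=0):
    """Random-walk Metropolis with a symmetric uniform proposal."""
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_iter):
        proposal = x + rng.uniform(-step, step)   # symmetric proposal around x
        log_ratio = log_pi(proposal) - log_pi(x)  # log of pi(proposal) / pi(x)
        # accept with probability min(1, pi(proposal) / pi(x))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        chain.append(x)                           # a rejected move repeats the old state
    return chain

chain = metropolis(log_target, x0=0.0, n_iter=50000)
kept = chain[5000:]                               # discard an (assumed) burn-in period
chain_mean = sum(kept) / len(kept)
chain_var = sum((v - chain_mean) ** 2 for v in kept) / len(kept)
```

Only the ratio $\pi(\mathbf x') / \pi(\mathbf x)$ enters the acceptance step, so the normalizing constant of $\pi$ is never needed; the resulting samples are dependent, but their mean and variance should approach those of the target (0 and 1 here).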

Continue reading



SMC in Biological Problems

July 22, 2017 0 Comments

Continue reading



Estimate Parameters in Logistic Regression

July 30, 2017 0 Comments

Continue reading



Poisson Regression

July 31, 2017 0 Comments

Continue reading



Story about P value

August 09, 2017 0 Comments

“The p value was never meant to be used the way it’s used today.” –Goodman

Continue reading



Conjugate Gradient for Regression

August 13, 2017 0 Comments

The conjugate gradient method is an iterative method for solving a linear system of equations, so we can use the conjugate gradient method to estimate the parameters in (linear/ridge) regression.
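
A minimal pure-Python sketch of this connection, assuming the regression is posed through the normal equations $(X^\top X + \lambda I)\beta = X^\top y$, which form a symmetric positive definite system when $X$ has full column rank or $\lambda > 0$. The helper names and the toy data are illustrative assumptions, not the code from the full post.

```python
def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, tol=1e-20, max_iter=None):
    """Solve A x = b for a symmetric positive definite matrix A."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x at the start x = 0
    p = list(r)                      # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter or 10 * n):
        if rs_old < tol:             # squared residual norm is small enough
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)  # exact line search along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # new direction, conjugate to the previous ones with respect to A
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

def ridge_normal_equations(X, y, lam):
    """Build A = X^T X + lam * I and b = X^T y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return A, b

# Toy data generated from y = 1 + 2 * x without noise (first column is the intercept).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
A, b = ridge_normal_equations(X, y, lam=0.0)  # lam = 0 gives ordinary least squares
beta = conjugate_gradient(A, b)
```

With `lam=0.0` the solver should recover the intercept 1 and slope 2 used to generate the data; a positive `lam` gives the ridge estimate from the same routine.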

Continue reading



Cox Regression

August 17, 2017 0 Comments

Survival analysis examines and models the time it takes for events to occur. It focuses on the distribution of survival times. There are many well-known methods for estimating the unconditional survival distribution, but the more interesting models examine the relationship between survival and one or more predictors, usually termed covariates in the survival-analysis literature. The Cox proportional-hazards regression model is one of the most widely used methods of survival analysis.

Continue reading



Restricted Boltzmann Machines

August 26, 2017 0 Comments

Continue reading



Dynamics of Helicobacter pylori colonization

August 31, 2017 0 Comments

This post contains my notes on this paper.

Continue reading



Healthy Human Microbiome

September 01, 2017 0 Comments

This post is for The Human Microbiome Project Consortium, Huttenhower, C., Gevers, D., Knight, R., Abubucker, S., Badger, J. H., … White, O. (2012). Structure, function and diversity of the healthy human microbiome. Nature, 486(7402), 207–214.

Continue reading



Dynamics of Helicobacter pylori Infection

September 01, 2017 0 Comments

The note is for Kirschner, D. E., & Blaser, M. J. (1995). The dynamics of helicobacter pylori infection of the human stomach. Journal of Theoretical Biology, 176(2), 281–290.

Continue reading



Basic Principles of Monte Carlo

September 07, 2017 0 Comments

Continue reading



Persistence of species in the face of environmental stochasticity

September 18, 2017 0 Comments

Sebastian Schreiber gave a talk titled Persistence of species in the face of environmental stochasticity.

Continue reading



A Faster Algorithm for Repeated Linear Regression

September 21, 2017 0 Comments

Repeated linear regression means fitting a linear regression many times, where the regressions share some common parts.

Continue reading



An R Package: Fit Repeated Linear Regressions

September 26, 2017 0 Comments

Repeated linear regressions refer to a set of linear regressions that share several of the same variables.

Continue reading



Stochastic Epidemic Models

October 11, 2017 0 Comments

Discuss three different methods for formulating stochastic epidemic models.

Continue reading



Essentials of Survival Time Analysis

October 11, 2017 0 Comments

This post aims to clarify the relationship between rates and probabilities.

Continue reading



Model-Free Scoring System for Risk Prediction

October 17, 2017 0 Comments

Continue reading



Power Analysis

December 27, 2017 0 Comments

Continue reading



ECOC

August 18, 2018

The note is for Dietterich, T. and Bakiri, G. (1995). Solving multiclass learning problems via error-correcting output codes, Journal of Artificial Intelligence Research 2: 263–286.

Continue reading



Gibbs in genetics

August 24, 2018

The note is for Gilks, W. R., Richardson, S., & Spiegelhalter, D. (Eds.). (1995). Markov chain Monte Carlo in practice. CRC press.

Continue reading



Evolutionary Systems Biology

December 30, 2018

The note is for Chapter 1 of Soyer, Orkun S., ed. 2012 Evolutionary Systems Biology. Advances in Experimental Medicine and Biology, 751. New York: Springer.

Continue reading



Metabolic Networks and Their Evolution

December 31, 2018

The note is for Chapter 2 of Soyer, Orkun S., ed. 2012 Evolutionary Systems Biology. Advances in Experimental Medicine and Biology, 751. New York: Springer.

Continue reading



Small World inside Large Metabolic Networks

January 02, 2019

The note is for Wagner, A., & Fell, D. A. (2001). The small world inside large metabolic networks. Proceedings of the Royal Society of London B: Biological Sciences, 268(1478), 1803–1810.

Continue reading



Counting Process Based Dimension Reduction Methods for Censored Data

January 06, 2019

The note is for Sun, Q., Zhu, R., Wang, T., & Zeng, D. (2017). Counting Process Based Dimension Reduction Methods for Censored Outcomes. ArXiv:1704.05046 [Stat].

Continue reading



Reconstruct Gaussian DAG

January 09, 2019

This note is based on Yuan, Y., Shen, X., Pan, W., & Wang, Z. (n.d.). Constrained likelihood for reconstructing a directed acyclic Gaussian graph. Biometrika.

Continue reading



Reversible jump Markov chain Monte Carlo

January 10, 2019

The note is for Green, P.J. (1995). “Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination”. Biometrika. 82 (4): 711–732.

Continue reading



Approximate $\ell_0$-penalized piecewise-constant estimate of graphs

January 13, 2019

This note is for Fan, Z., & Guan, L. (2018). Approximate $\ell_{0}$-penalized estimation of piecewise-constant signals on graphs. The Annals of Statistics, 46(6B), 3217–3245.

Continue reading



PLS in High-Dimensional Regression

January 15, 2019

This note is based on Cook, R. D., & Forzani, L. (2019). Partial least squares prediction in high-dimensional regression. The Annals of Statistics, 47(2), 884–908.

Continue reading



Sequential Monte Carlo Methods

January 19, 2019

This note is for Section 3 of Doucet, A., & Johansen, A. M. (2009). A tutorial on particle filtering and smoothing: Fifteen years later. Handbook of Nonlinear Filtering, 12(656–704), 3., and it is the complement of my previous post.

Continue reading



The Kalman Filter and Extended Kalman Filter

January 21, 2019

Continue reading



Annealed SMC for Bayesian Phylogenetics

January 24, 2019

This note is for Wang, L., Wang, S., & Bouchard-Côté, A. (2018). An Annealed Sequential Monte Carlo Method for Bayesian Phylogenetics. ArXiv:1806.08813 [q-Bio, Stat].

Continue reading



Annealed Importance Sampling

January 28, 2019

This is the note for Neal, R. M. (1998). Annealed Importance Sampling. ArXiv:Physics/9803008.

Continue reading



Calculating Marginal likelihood

January 30, 2019

The note is for Fourment, M., Magee, A. F., Whidden, C., Bilge, A., Matsen IV, F. A., & Minin, V. N. (2018). 19 dubious ways to compute the marginal likelihood of a phylogenetic tree topology.

Continue reading



The First Glimpse into Pseudolikelihood

February 12, 2019

This post takes a first glimpse at pseudolikelihood.

Continue reading



Comparisons of Three Likelihood Criteria

February 12, 2019

The note is for Nelder, J. A., & Lee, Y. (1992). Likelihood, Quasi-Likelihood and Pseudolikelihood: Some Comparisons. Journal of the Royal Statistical Society. Series B (Methodological), 54(1), 273–284.

Continue reading



Identification of PE Genes in Cell Cycle

February 13, 2019

This note is based on Fan, X., Pyne, S., & Liu, J. S. (2010). Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle. The Annals of Applied Statistics, 4(2), 988–1013.

Continue reading



Gibbs Sampling for the Multivariate Normal

February 13, 2019

This note is based on Chapter 7 of Hoff PD. A first course in Bayesian statistical methods. Springer Science & Business Media; 2009 Jun 2.

Continue reading



Review of Composite Likelihood

February 13, 2019

This note is based on Varin, C., Reid, N., & Firth, D. (2011). AN OVERVIEW OF COMPOSITE LIKELIHOOD METHODS. Statistica Sinica, 21(1), 5–42, a survey of recent developments in the theory and application of composite likelihood.

Continue reading



Studentized U-statistic

February 15, 2019 0 Comments

In Prof. Shao’s wonderful talk, Wandering around the Asymptotic Theory, he mentioned the Studentized U-statistics. I am interested in the derivation of the variances in the denominator.

Continue reading



Deep Learning

February 16, 2019

This note is based on LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.

Continue reading



A Bayesian Perspective of Deep Learning

February 17, 2019

This note is for Polson, N. G., & Sokolov, V. (2017). Deep Learning: A Bayesian Perspective. Bayesian Analysis, 12(4), 1275–1304.

Continue reading



Presistency

February 18, 2019

The paper, Greenshtein and Ritov (2004), is recommended by Larry Wasserman in his post Consistency, Sparsistency and Presistency.

Continue reading



Restricted Isometry Property

February 19, 2019

I encountered the term RIP in Larry Wasserman’s post, RIP RIP (Restricted Isometry Property, Rest In Peace), and also found some material about RIP in Hastie et al.’s book, Statistical Learning with Sparsity.

Continue reading



Continuous Time Markov Chain

February 20, 2019

This note is based on Karl Sigman’s IEOR 6711: Continuous-Time Markov Chains.

Continue reading



Stein's Paradox

February 21, 2019

I learned about Stein’s Paradox from Larry Wasserman’s post, STEIN’S PARADOX. Perhaps I had encountered the term before, but I cannot recall anything about it. (I am guilty.)

Continue reading



Bootstrap Hypothesis Testing

March 03, 2019 0 Comments

This report is motivated by comments under Larry’s post, Modern Two-Sample Tests.

Continue reading



Illustrate Path Sampling by Stan Programming

March 06, 2019 0 Comments

This post reviews path sampling from the lecture slides of STAT 5020, notes the general path sampling described by Gelman and Meng (1998), and then illustrates it on a toy example in the Stan programming language.

Continue reading



Evaluate Variational Inference

March 07, 2019

A brief summary of the post, Eid ma clack shaw zupoven del ba.

Continue reading



Bernstein Bounds

March 08, 2019

I noticed that papers on matrix/tensor completion always talk about the Bernstein inequality, so I picked up the Bernstein bounds discussed in Wainwright (2019).

Continue reading



The Correlated Topic Model

March 12, 2019

This note is for Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of Science. The Annals of Applied Statistics, 1(1), 17–35.

Continue reading



Distributed inference for quantile regression processes

March 13, 2019

This note is for Volgushev, S., Chao, S.-K., & Cheng, G. (2019). Distributed inference for quantile regression processes. The Annals of Statistics, 47(3), 1634–1662.

Continue reading



Functional Data Analysis

March 14, 2019

Continue reading



Functional Data Analysis by Matrix Completion

March 15, 2019

Continue reading



High Dimensional Covariance Matrix Estimation

March 19, 2019

Continue reading



Convergence rates of least squares

March 25, 2019

This note is for Han, Q., & Wellner, J. A. (2017). Convergence rates of least squares regression estimators with heavy-tailed errors.

Continue reading



Joint Summarized by Marginal or Conditional?

March 25, 2019

I happened to read Yixuan’s blog about a question related to the course Statistical Inference: whether two marginal distributions can determine the joint distribution. The question is adapted from Exercise 4.47 of Casella and Berger (2002).

Continue reading



FARM-Test

March 29, 2019

This note is for Fan, J., Ke, Y., Sun, Q., & Zhou, W.-X. (2017). FarmTest: Factor-Adjusted Robust Multiple Testing with Approximate False Discovery Control. ArXiv:1711.05386 [Stat].

Continue reading



Frequentist Accuracy of Bayesian Estimates

March 31, 2019

This note is for Efron’s slide: Frequentist Accuracy of Bayesian Estimates, which is recommended by Larry’s post: Shaking the Bayesian Machine.

Continue reading



Soft Imputation in Matrix Completion

April 01, 2019

This post is based on Chapter 7 of Statistical Learning with Sparsity: The Lasso and Generalizations, and I wrote an R program to reproduce the simulations to get a better understanding.

Continue reading



Coupled Minimum-Cost Flow Cell Tracking

April 02, 2019

This note is for Padfield, D., Rittscher, J., & Roysam, B. (2011). Coupled minimum-cost flow cell tracking for high-throughput quantitative analysis. Medical Image Analysis, 15(4), 650–668.

Continue reading



Weird Things in Mixture Models

April 04, 2019

This note is based on Larry’s post, Mixture Models: The Twilight Zone of Statistics.

Continue reading



Subgradient

April 08, 2019

This post is mainly based on Hastie et al. (2015), and incorporates some material from Watson (1992).

Continue reading



Tracking Multiple Interacting Targets via MCMC-MRF

April 09, 2019

This note is for Khan, Z., Balch, T., & Dellaert, F. (2004). An MCMC-Based Particle Filter for Tracking Multiple Interacting Targets. In T. Pajdla & J. Matas (Eds.), Computer Vision - ECCV 2004 (pp. 279–290). Springer Berlin Heidelberg.

Continue reading



Methods for Cell Tracking

April 09, 2019

This post is for the survey paper, Meijering, E., Dzyubachyk, O., & Smal, I. (2012). Chapter nine - Methods for Cell and Particle Tracking. In P. M. Conn (Ed.), Methods in Enzymology (pp. 183–200).

Continue reading



Normalizing Constant

April 10, 2019

Larry discussed the normalizing constant paradox in his blog.

Continue reading



Multiple Tracking with Rao-Blackwellized marginal particle filtering

April 10, 2019

This note is for Smal, I., Meijering, E., Draegestein, K., Galjart, N., Grigoriev, I., Akhmanova, A., … Niessen, W. (2008). Multiple object tracking in molecular bioimaging by Rao-Blackwellized marginal particle filtering. Medical Image Analysis, 12(6), 764–777.

Continue reading



Statistical Inference for Lasso

April 15, 2019

This note is based on Chapter 6 of Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical Learning with Sparsity. 362.

Continue reading



Least Squares for SIMs

April 15, 2019

In the last lecture of STAT 5030, Prof. Lin shared one of the results in the paper, Neykov, M., Liu, J. S., & Cai, T. (2016). L1-Regularized Least Squares for Support Recovery of High Dimensional Single Index Models with Gaussian Designs. Journal of Machine Learning Research, 17(87), 1–37, or rather the starting point of the paper: the following lemma. It seems that the condition and the conclusion are exactly the same as those of Sliced Inverse Regression, except for a direct interpretation as a least squares regression.

Continue reading



Identifiability and Estimability

April 20, 2019

Materials from STAT 5030.

Continue reading



Self-normalized Limit Theory and Stein's Method

May 01, 2019

This note consists of the lecture material of STAT 6060 taught by Prof. Shao, four homework assignments (indexed by “Homework”) and several personal comments (indexed by “Note”).

Continue reading



The General Decision Problem

May 06, 2019

This note is based on Chapter 1 of Lehmann EL, Romano JP. Testing statistical hypotheses. Springer Science & Business Media; 2006 Mar 30.

Continue reading



Medicine Meets AI

June 23, 2019

Over the last two days, I attended the conference Medicine Meets AI 2019: East Meets West, which helped me learn more about AI from the industrial and medical perspectives.

Continue reading



Surprises in High-Dimensional Ridgeless Least Squares Interpolation

June 24, 2019

This post is based on Hastie, T., Montanari, A., Rosset, S., & Tibshirani, R. J. (2019). Surprises in High-Dimensional Ridgeless Least Squares Interpolation. 53.

Continue reading



Bayesian Conjugate Gradient Method

June 27, 2019

This note is for Cockayne, J., Oates, C. J., Ipsen, I. C. F., & Girolami, M. (2018). A Bayesian Conjugate Gradient Method. Bayesian Analysis.

Continue reading



Global data association for MOT using network flows

July 10, 2019

This note is based on Li Zhang, Yuan Li, & Nevatia, R. (2008). Global data association for multi-object tracking using network flows. 2008 IEEE Conference on Computer Vision and Pattern Recognition, 1–8.

Continue reading



High Dimensional LDA

July 15, 2019

This note is for Cai, T. T., & Zhang, L. (n.d.). High dimensional linear discriminant analysis: Optimality, adaptive algorithm and missing data. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 0(0).

Continue reading



Canonical Variate Analysis

July 16, 2019

This note is based on Campbell, N. A. (1979). CANONICAL VARIATE ANALYSIS: SOME PRACTICAL ASPECTS. 243.

Continue reading



SMC-PHD Filter

July 17, 2019

This post is based on Ristic, B., Clark, D., & Vo, B. (2010). Improved SMC implementation of the PHD filter. 2010 13th International Conference on Information Fusion, 1–8.

Continue reading



Multi-estimate extraction for SMC-PHD

July 17, 2019

This post is based on Li, T., Corchado, J. M., Sun, S., & Fan, H. (2017). Multi-EAP: Extended EAP for multi-estimate extraction for SMC-PHD filter. Chinese Journal of Aeronautics, 30(1), 368–379.

Continue reading



An Optimal Control Approach for Deep Learning

July 19, 2019

This note is based on Li, Q., & Hao, S. (2018). An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks. ArXiv:1803.01299 [Cs].

Continue reading



High-dimensional linear mixed-effect model

July 21, 2019

This post is based on Li, S., Cai, T. T., & Li, H. (2019). Inference for high-dimensional linear mixed-effects models: A quasi-likelihood approach. ArXiv:1907.06116 [Stat].

Continue reading



An Adaptive Algorithm for online FDR

July 21, 2019

This post is based on Ramdas, A., Zrnic, T., Wainwright, M., & Jordan, M. (2018). SAFFRON: An adaptive algorithm for online control of the false discovery rate. ArXiv:1802.09098 [Cs, Math, Stat].

Continue reading



The Simplex Method

July 23, 2019

This note is based on Chapter 13 of Nocedal, J., & Wright, S. (2006). Numerical optimization. Springer Science & Business Media.

Continue reading



Reluctant Interaction Modeling

July 23, 2019

This note is based on Yu, G., Bien, J., & Tibshirani, R. (2019). Reluctant Interaction Modeling. ArXiv:1907.08414 [Stat].

Continue reading



Additive Bayesian Variable Selection

August 05, 2019

This post is based on Rossell, D., & Rubio, F. J. (2019). Additive Bayesian variable selection under censoring and misspecification. ArXiv:1907.13563 [Math, Stat].

Continue reading



Interior-point Method

August 16, 2019

Nocedal and Wright (2006) and Boyd and Vandenberghe (2004) present slightly different introductions to the interior-point method. More specifically, the former considers only equality constraints, while the latter also incorporates inequality constraints.

Continue reading



Debiased Lasso

September 08, 2019

This post is based on Section 6.4 of Hastie, Trevor, Robert Tibshirani, and Martin Wainwright. “Statistical Learning with Sparsity,” 2016, 362.

Continue reading



Likelihood-free inference by ratio estimation

September 09, 2019

This note is for Thomas, O., Dutta, R., Corander, J., Kaski, S., & Gutmann, M. U. (2016). Likelihood-free inference by ratio estimation. ArXiv:1611.10242 [Stat], which I found through Xi’an’s blog.

Continue reading



Basics of $B$-splines

September 09, 2019

This note is based on de Boor, C. (1978). A Practical Guide to Splines, Springer, New York.

Continue reading



Functional PCA

September 20, 2019

This post is based on Ramsay, J. O., & Silverman, B. W. (2005). Functional data analysis (Second edition). New York, NY: Springer.

Continue reading



Multiple human tracking with RGB-D data

September 20, 2019

This note is based on the survey paper Camplani, M., Paiement, A., Mirmehdi, M., Damen, D., Hannuna, S., Burghardt, T., & Tao, L. (2016). Multiple human tracking in RGB-depth data: A survey. IET Computer Vision, 11(4), 265–285.

Continue reading



ABC for Socks

September 24, 2019

This post is based on Prof. Robert’s slides at JSM 2019 and an intuitive blog post by Rasmus Bååth.

Continue reading



Optimality for Sparse Group Lasso

September 29, 2019

This note is based on Cai, T. T., Zhang, A., & Zhou, Y. (2019). Sparse Group Lasso: Optimal Sample Complexity, Convergence Rate, and Statistical Inference. ArXiv:1909.09851 [Cs, Math, Stat].

Continue reading



Kernel Ridgeless Regression Can Generalize

September 30, 2019

This note is based on Liang, T., & Rakhlin, A. (2018). Just Interpolate: Kernel “Ridgeless” Regression Can Generalize. ArXiv:1808.00387 [Cs, Math, Stat].

Continue reading



Sub Gaussian

October 05, 2019

This post is based on Wainwright (2019).

Continue reading



Linear Regression with Partially Shuffled Data

October 08, 2019

This post is based on Slawski, M., Diao, G., & Ben-David, E. (2019). A Pseudo-Likelihood Approach to Linear Regression with Partially Shuffled Data. ArXiv:1910.01623 [Cs, Stat].

Continue reading



Noise Outsourcing

October 10, 2019

I learnt the term Noise Outsourcing from kjytay’s blog, where the post is based on Teh Yee Whye’s IMS Medallion Lecture at JSM 2019.

Continue reading



Isotropic vs. Anisotropic

October 24, 2019

I came across isotropic and anisotropic covariance functions in kjytay’s blog, and then found more material, Chapter 4 of the book Gaussian Processes for Machine Learning, via a reference in the StackExchange question What is an isotropic (spherical) covariance matrix?

Continue reading



Partial Least Squares for Functional Data

October 31, 2019

This post is based on Delaigle, A., & Hall, P. (2012). Methodology and theory for partial least squares applied to functional data. The Annals of Statistics, 40(1), 322–352.

Continue reading



Model-based Approach for Joint Analysis of Single-cell data

October 31, 2019

This post is based on Lin, Z., Zamanighomi, M., Daley, T., Ma, S., & Wong, W. H. Model-based approach to the joint analysis of single-cell data on chromatin accessibility and gene expression. Statistical Science.

Continue reading



Genetic Relatedness in High-Dimensional Linear Models

October 31, 2019

This post is based on Guo, Z., Wang, W., Cai, T. T., & Li, H. (2019). Optimal Estimation of Genetic Relatedness in High-Dimensional Linear Models. Journal of the American Statistical Association, 114(525), 358–369.

Continue reading



MM algorithm for Variance Components Models

November 01, 2019

This post is based on Zhou, H., Hu, L., Zhou, J., & Lange, K. (2019). MM Algorithms for Variance Components Models. Journal of Computational and Graphical Statistics, 28(2), 350–361.

Continue reading



The Cost of Privacy

November 01, 2019

This note is based on Cai, T. T., Wang, Y., & Zhang, L. (2019). The Cost of Privacy: Optimal Rates of Convergence for Parameter Estimation with Differential Privacy. ArXiv:1902.04495 [Cs, Stat].

Continue reading



Active Contours

November 12, 2019

This post is based on Ray, N., & Acton, S. T. (2002). Active contours for cell tracking. Proceedings Fifth IEEE Southwest Symposium on Image Analysis and Interpretation, 274–278.

Continue reading



Combining $p$-values in Meta Analysis

December 04, 2019

I came across the term meta-analysis in the previous post, and while reading the paper discussed there I had another question about nominal size, which reminded me of Keith’s notes. By coincidence, the same notes also cover meta-analysis. Hence, this post is mainly based on Keith’s notes, and I reproduce the power curves myself.

Continue reading



Fantastic Generalization Measures and Where to Find Them

December 06, 2019

This post is based on Jiang, Y., Neyshabur, B., Mobahi, H., Krishnan, D., & Bengio, S. (2019). Fantastic Generalization Measures and Where to Find Them. ArXiv:1912.02178 [Cs, Stat], which one of my friends shared in WeChat Moments; I then took a quick look.

Continue reading



Quantile Regression Forests

December 10, 2019

Since an upcoming seminar is related to this topic, this post is based on Meinshausen, N. (2006). Quantile Regression Forests. 17.

Continue reading



Conditional Quantile Regression Forests

December 12, 2019

This note is based on the slides of the seminar Conditional Quantile Random Forest, given by Dr. Huichen ZHU.

Continue reading



Lagrange Multiplier Test

December 17, 2019

This post is based on Peter BENTLER’s talk, S.-Y. Lee’s Lagrange Multiplier Test in Structural Modeling: Still Useful?, at the International Statistical Conference in Memory of Professor Sik-Yum Lee.

Continue reading



DNA copy number profiling: from bulk tissue to single cells

January 02, 2020

This post is based on the talk given by Yuchao Jiang at the 11th ICSA International Conference on Dec. 20th, 2019.

Continue reading



Concentration Inequality for Machine Learning

January 09, 2020

This post is based on the material of the first lecture of STAT6050 instructed by Prof. Wicker.

Continue reading



Classification with Imperfect Training Labels

January 15, 2020

This post is based on the talk given by Timothy I. Cannings at the 11th ICSA International Conference on Dec. 22nd, 2019; the corresponding paper is Cannings, T. I., Fan, Y., & Samworth, R. J. (2019). Classification with imperfect training labels. ArXiv:1805.11505 [Math, Stat].

Continue reading



Multiple Isotonic Regression

February 20, 2020

The first two sections are based on a good tutorial on isotonic regression, and the third section consists of the slides for the talk given by Prof. Cun-Hui Zhang at the 11th ICSA International Conference on Dec. 21st, 2019.

Continue reading



Bernstein-von Mises Theorem

February 24, 2020

I came across the Bernstein-von Mises theorem in Yuling Yao’s blog, and I also found a quick definition in the blog hosted by Prof. Andrew Gelman, although that post is not by Gelman himself. By coincidence, the former is a PhD student of the latter!

Continue reading



Common Principal Components

February 28, 2020

This post is based on Flury (1984).

Continue reading



Bootstrap Sampling Distribution

March 05, 2020

This note is based on Lehmann, E. L., & Romano, J. P. (2005). Testing statistical hypotheses (3rd ed). Springer.

Continue reading



Survey on Functional Principal Component Analysis

April 25, 2020

This post is based on Shang, H. L. (2014). A survey of functional principal component analysis. AStA Advances in Statistical Analysis, 98(2), 121–142.

Continue reading



Robust Forecasting by Functional Principal Component Analysis

April 25, 2020

This post is based on Hyndman, R. J., & Shahid Ullah, Md. (2007). Robust forecasting of mortality and fertility rates: A functional data approach. Computational Statistics & Data Analysis, 51(10), 4942–4956.

Continue reading



Internal migration and transmission dynamics of tuberculosis

April 30, 2020

This post is based on Yang, C., Lu, L., Warren, J. L., Wu, J., Jiang, Q., Zuo, T., Gan, M., Liu, M., Liu, Q., DeRiemer, K., Hong, J., Shen, X., Colijn, C., Guo, X., Gao, Q., & Cohen, T. (2018). Internal migration and transmission dynamics of tuberculosis in Shanghai, China: An epidemiological, spatial, genomic analysis. The Lancet Infectious Diseases, 18(7), 788–795.

Continue reading



See all posts →