
Reproduce Auto-modeling on Many-Normal-Means

Lijun Wang (2022.07.31)

\[\newcommand\bbP{\mathbb{P}}\notag\]

Here is my attempt on reproducing the results of auto-modeling on the many-Normal-means example.

The main algorithm is

[Figure: the main algorithm (screenshot from the paper)]

\[\pi^{(n)}(\theta,\lambda) = \sum_{k=2}^m \lambda_k\vert \theta_k\vert = \sum_{k=2}^m \lambda_k\theta_k\]

where the second equality uses the constraint $\theta_k\ge 0$ for $k\ge 2$ in this parametrization (the component means are nondecreasing), and hence

\[\frac{\partial \pi^{(n)}(\theta,\lambda)}{\partial \theta} = \begin{bmatrix} 0 & \lambda_2 & \lambda_3 & \cdots & \lambda_m & 0 & \cdots & 0 \end{bmatrix}^T\]

Since $\lambda_i\ge 0$, it follows that \(\lambda_i = \max(0, d_i),\ i=2,\ldots,m\,,\) where $d_i$ is the $i$-th element of \(E_{(X, Y)\sim \bbP}\left[\frac{\partial L(\theta\mid X, Y)}{\partial \theta}\right] - \frac 1n\sum_{i=1}^n\frac{\partial L(\theta\mid x_i, y_i)}{\partial \theta}\,.\) For the many-Normal-means example,

\[\begin{align*} L_i \triangleq L(\theta\mid y_i) &= -\log \sum_{k=1}^m \alpha_k \exp\left(-\frac{(y_i-\eta_k)^2}{2}\right)\\ &= -\log \sum_{k=1}^m \theta_{m+k} \exp\left(-\frac{(y_i-\sum_{j=1}^k\theta_j)^2}{2}\right)\,, \end{align*}\]

then

\[\frac{\partial L_i}{\partial \theta_\ell} = \begin{cases} -\dfrac{\sum_{k=\ell}^m\theta_{m+k}\exp\left(-\frac{(y_i-\sum_{j=1}^k\theta_j)^2}{2}\right)\left(y_i-\sum_{j=1}^k\theta_j\right)}{\sum_{k=1}^m\theta_{m+k}\exp\left(-\frac{(y_i-\sum_{j=1}^k\theta_j)^2}{2}\right)}\,, & \ell\le m\\ -\dfrac{\exp\left(-\frac{(y_i-\sum_{j=1}^{\ell-m}\theta_j)^2}{2}\right)}{\sum_{k=1}^{m}\theta_{m+k}\exp\left(-\frac{(y_i-\sum_{j=1}^k\theta_j)^2}{2}\right)}\,, & \ell\ge m+1\,. \end{cases}\]

For simplicity, I skip the step of approximating $\bbP$ by bootstrap samples and instead directly use the true $\bbP$.
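To make these formulas concrete, here is a minimal Julia sketch under the parametrization above ($\theta_1,\ldots,\theta_m$ are the increments with $\eta_k = \sum_{j=1}^k\theta_j$, and $\theta_{m+1},\ldots,\theta_{2m}$ are the weights $\alpha_k$). The names `loss`, `loss_grad`, and `update_λ` are my own, and `grad_true` and `grad_emp` are placeholders for $E_\bbP[\partial L/\partial\theta]$ and $\frac 1n\sum_i \partial L(\theta\mid y_i)/\partial\theta$:

# loss L_i for a single observation yi
function loss(θ, yi, m)
    η = cumsum(θ[1:m])                   # η_k = Σ_{j=1}^k θ_j
    α = θ[(m+1):2m]                      # α_k = θ_{m+k}
    return -log(sum(α .* exp.(-(yi .- η).^2 ./ 2)))
end

# gradient ∂L_i/∂θ following the two cases above
function loss_grad(θ, yi, m)
    η = cumsum(θ[1:m])
    α = θ[(m+1):2m]
    w = exp.(-(yi .- η).^2 ./ 2)
    denom = sum(α .* w)
    g = zeros(2m)
    for l = 1:m                          # case ℓ ≤ m
        g[l] = -sum(α[l:m] .* w[l:m] .* (yi .- η[l:m])) / denom
    end
    g[(m+1):2m] = -w ./ denom            # case ℓ ≥ m + 1
    return g
end

# λ-update: λ_i = max(0, d_i), i = 2,…,m; returns a length-m vector with λ[1] = 0 unused
update_λ(grad_true, grad_emp, m) = [0.0; max.(0, (grad_true - grad_emp)[2:m])]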

My pseudo Julia code is as follows:

using LinearAlgebra, JuMP, Ipopt

function auto_modeling()
    for i = 1:N
        θold = θ
        λold = λ
        λ = sol_λ_given_θ(y, θ, ...)   # update λ with θ fixed
        θ = sol_θ_given_λ(y, λ, ...)   # update θ with λ fixed
        # stop once both blocks have (approximately) converged
        if norm(θold - θ) < tol && norm(λold - λ) < tol
            break
        end
    end
end
function sol_λ_given_θ(y, θ, ...)
    # Equations (3) (4) (5)
end
function sol_θ_given_λ(y, λ, ...)
    model = Model(Ipopt.Optimizer)
    # express the optimization problem `min G` in the language of JuMP and Ipopt
    optimize!(model)
end
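For concreteness, here is a hedged sketch of `sol_θ_given_λ` (reusing the `using JuMP, Ipopt` above), assuming $G$ is the penalized average loss $\frac 1n\sum_i L(\theta\mid y_i) + \sum_{k=2}^m\lambda_k\theta_k$ subject to $\theta_k\ge 0$ for $k\ge 2$ and the weights summing to one; this is my reading, not necessarily the exact formulation in the paper:

function sol_θ_given_λ_sketch(y, λ, m)
    n = length(y)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, θ[1:2m])
    @constraint(model, [k = 2:2m], θ[k] >= 0)            # ordered means, nonnegative weights
    @constraint(model, sum(θ[k] for k = (m+1):2m) == 1)  # weights sum to one
    @NLobjective(model, Min,
        sum(-log(sum(θ[m+k] * exp(-(y[i] - sum(θ[j] for j = 1:k))^2 / 2)
                     for k = 1:m)) for i = 1:n) / n
        + sum(λ[k] * θ[k] for k = 2:m))                  # λ indexed so that λ[k] = λ_k
    optimize!(model)
    return value.(θ)
end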

As for g-modeling, I call the `deconv` function from the R package `deconvolveR`.
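For reference, a hedged sketch of that call via RCall, so everything stays in Julia; the argument names (`tau`, `X`, `family`, `pDegree`) and the grid follow my reading of the deconvolveR documentation and should be double-checked:

using RCall

y = randn(100)                           # placeholder data, for illustration only
R"""
library(deconvolveR)
tau <- seq(-6, 6, by = 0.25)             # grid for the latent means (illustrative)
res <- deconv(tau = tau, X = $y, family = "Normal", pDegree = 5)
"""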

As in the paper, I repeat the experiment 200 times and report the average mean squared error. The results are as follows:

For comparison, the corresponding results reported in the paper are: