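Let $M_n$ be random functions and let $M$ be a fixed function of $\theta$ such that, for every $\varepsilon > 0$ and with $d$ denoting the metric on the parameter set $\Theta$,
$$
\sup_{\theta\in\Theta}\vert M_n(\theta)-M(\theta)\vert \pto 0\,,\qquad \sup_{\theta:\, d(\theta,\theta_0)\ge\varepsilon} M(\theta) < M(\theta_0)\,.
$$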

Then any sequence of estimators $\hat\theta_n$ with $M_n(\hat\theta_n)\ge M_n(\theta_0)-o_P(1)$ converges in probability to $\theta_0$.

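Let $\Psi_n$ be random vector-valued functions and let $\Psi$ be a fixed vector-valued function of $\theta$ such that, for every $\varepsilon > 0$ and with $d$ denoting the metric on the parameter set $\Theta$,
$$
\sup_{\theta\in\Theta}\Vert \Psi_n(\theta)-\Psi(\theta)\Vert \pto 0\,,\qquad \inf_{\theta:\, d(\theta,\theta_0)\ge\varepsilon}\Vert\Psi(\theta)\Vert > 0 = \Vert\Psi(\theta_0)\Vert\,.
$$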

Then any sequence of estimators $\hat\theta_n$ such that $\Psi_n(\hat\theta_n)=o_P(1)$ converges in probability to $\theta_0$.

Let $\Theta$ be a subset of the real line and let $\Psi_n$ be random functions and $\Psi$ a fixed function of $\theta$ such that $\Psi_n(\theta)\rightarrow \Psi(\theta)$ in probability for every $\theta$. Assume that each map $\theta\mapsto \Psi_n(\theta)$ is continuous and has exactly one zero $\hat\theta_n$, or is nondecreasing with $\Psi_n(\hat \theta_n)=o_P(1)$. Let $\theta_0$ be a point such that $\Psi(\theta_0-\varepsilon) < 0 < \Psi(\theta_0+\varepsilon)$ for every $\varepsilon > 0$. Then $\hat\theta_n\pto\theta_0$.
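The nondecreasing case of this lemma can be illustrated numerically. The sketch below is only an illustration under assumptions that are not in the text: it takes the Huber score $\psi_\theta(x)=\max(-k,\min(k,\theta-x))$ with the conventional tuning constant $k=1.345$, draws a normal sample centred at $\theta_0=2$, and finds a near-zero of $\Psi_n$ by bisection. The function names (`huber_psi`, `psi_n`, `solve_monotone`, `simulate`) are hypothetical.

```python
import random


def huber_psi(u, k=1.345):
    """Huber score: the identity on [-k, k], clipped to +-k outside."""
    return max(-k, min(k, u))


def psi_n(theta, xs, k=1.345):
    """Psi_n(theta) = (1/n) sum_i huber_psi(theta - x_i), nondecreasing in theta."""
    return sum(huber_psi(theta - x, k) for x in xs) / len(xs)


def solve_monotone(xs, k=1.345, tol=1e-10):
    """Bisection for a near-zero of the nondecreasing map theta -> psi_n(theta)."""
    # At min(xs) every term is <= 0 and at max(xs) every term is >= 0,
    # so a sign change is bracketed by the sample range.
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi_n(mid, xs, k) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate(n, theta0=2.0, seed=0):
    """Draw n points from N(theta0, 1) (an assumed model) and solve Psi_n = 0."""
    rng = random.Random(seed)
    xs = [rng.gauss(theta0, 1.0) for _ in range(n)]
    return solve_monotone(xs)
```

Calling `simulate(n)` for increasing `n` should give values that settle near `theta0`, which is what the lemma predicts here because $\Psi(\theta)=\mathrm{E}\,\psi_\theta(X)$ is strictly increasing through zero at $\theta_0$ for this symmetric model.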

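Let $X_1,\ldots,X_n$ be a sample from some distribution $P$, and let a random and a “true” criterion function be of the form
$$
\Psi_n(\theta) = \frac{1}{n}\sum_{i=1}^n\psi_\theta(X_i) = \bbP_n\psi_\theta\,,\qquad \Psi(\theta) = P\psi_\theta\,.
$$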

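Assume that the estimator $\hat\theta_n$ is a zero of $\Psi_n$ and converges in probability to a zero $\theta_0$ of $\Psi$. Because $\hat\theta_n\pto\theta_0$, it makes sense to expand $\Psi_n(\hat\theta_n)$ in a Taylor series around $\theta_0$. Assume for simplicity that $\theta$ is one-dimensional. Then
$$
0 = \Psi_n(\hat\theta_n) = \Psi_n(\theta_0) + (\hat\theta_n-\theta_0)\dot\Psi_n(\theta_0) + \frac{1}{2}(\hat\theta_n-\theta_0)^2\ddot\Psi_n(\tilde\theta_n)\,,
$$
where $\tilde \theta_n$ is a point between $\hat\theta_n$ and $\theta_0$. This can be rewritten as
$$
\sqrt n(\hat\theta_n-\theta_0) = \frac{-\sqrt n\,\Psi_n(\theta_0)}{\dot\Psi_n(\theta_0)+\frac{1}{2}(\hat\theta_n-\theta_0)\ddot\Psi_n(\tilde\theta_n)}\,.
$$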

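In the multivariate case the same heuristic argument gives
$$
\sqrt n(\hat\theta_n-\theta_0) = -(P\dot\psi_{\theta_0})^{-1}\frac{1}{\sqrt n}\sum_{i=1}^n\psi_{\theta_0}(X_i)+o_P(1)\,,
$$
where the invertibility of the matrix $P\dot\psi_{\theta_0}$ is a condition.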

For each $\theta$ in an open subset of Euclidean space, let $x\mapsto \psi_\theta(x)$ be a measurable vector-valued function such that, for every $\theta_1$ and $\theta_2$ in a neighborhood of $\theta_0$ and a measurable function $\dot\psi$ with $P\dot\psi^2<\infty$,
$$
\Vert \psi_{\theta_1}(x) - \psi_{\theta_2}(x)\Vert \le \dot\psi(x) \Vert \theta_1-\theta_2\Vert\,.
$$
Assume that $P\Vert\psi_{\theta_0}\Vert^2<\infty$ and that the map $\theta\mapsto P\psi_\theta$ is differentiable at a zero $\theta_0$, with nonsingular derivative matrix $V_{\theta_0}$. If $\bbP_n\psi_{\hat\theta_n}=o_P(n^{-1/2})$ and $\hat\theta_n\pto \theta_0$, then
$$
\sqrt n(\hat\theta_n-\theta_0) = -V_{\theta_0}^{-1}\frac{1}{\sqrt n}\sum_{i=1}^n\psi_{\theta_0}(X_i)+o_P(1)\,.
$$
In particular, the sequence $\sqrt n(\hat\theta_n-\theta_0)$ is asymptotically normal with mean zero and covariance matrix $V_{\theta_0}^{-1}P\psi_{\theta_0}\psi_{\theta_0}^T(V_{\theta_0}^{-1})^T$.
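The covariance formula above can be turned into a plug-in estimate. The following is a minimal sketch for the scalar case, not a definitive implementation. It assumes the Huber estimating function $\psi_\theta(x)=\max(-k,\min(k,x-\theta))$, and it estimates $V_{\theta_0}$ by the empirical average of the pointwise derivative of $\theta\mapsto\psi_\theta(x)$, which is an extra assumption because the theorem only requires differentiability of $\theta\mapsto P\psi_\theta$. The names `psi`, `z_estimate` and `sandwich_variance` are hypothetical.

```python
import random


def psi(x, theta, k):
    """Huber estimating function psi_theta(x): x - theta clipped to [-k, k]."""
    return max(-k, min(k, x - theta))


def z_estimate(xs, k, tol=1e-10):
    """Bisection for a zero of theta -> P_n psi_theta, which is nonincreasing."""
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(psi(x, mid, k) for x in xs) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sandwich_variance(xs, theta, k):
    """Plug-in estimate of V^{-1} (P psi^2) V^{-1} for scalar theta."""
    n = len(xs)
    # Pointwise derivative of theta -> psi_theta(x): -1 if |x - theta| < k, else 0.
    v_hat = -sum(1.0 for x in xs if abs(x - theta) < k) / n
    # Empirical second moment of the estimating function.
    meat = sum(psi(x, theta, k) ** 2 for x in xs) / n
    return meat / v_hat ** 2


def standard_error(xs, k=1.345):
    """Estimate and approximate standard error sqrt(sandwich variance / n)."""
    theta_hat = z_estimate(xs, k)
    return theta_hat, (sandwich_variance(xs, theta_hat, k) / len(xs)) ** 0.5
```

For very large `k` the clipping never binds, so the estimate reduces to the sample mean and the sandwich variance to the empirical variance, which gives a quick consistency check of the sketch.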

The function $\theta\mapsto \sign(x-\theta)$ is not Lipschitz, yet the sample median, which solves the corresponding estimating equation, is asymptotically normal. The Lipschitz condition is therefore stronger than necessary.
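As a worked check, assume (for illustration only; this is not stated above) that $P$ has a distribution function $F$ with a density $f$ that is continuous and positive at the median $\theta_0$. With $\psi_\theta(x)=\sign(x-\theta)$,
$$
P\psi_\theta = 1-2F(\theta)\,,\qquad V_{\theta_0} = -2f(\theta_0)\,,\qquad P\psi_{\theta_0}^2 = 1\,,
$$
so the formula $V_{\theta_0}^{-1}P\psi_{\theta_0}^2V_{\theta_0}^{-1}$ would give the asymptotic variance
$$
\frac{1}{4f(\theta_0)^2}\,,
$$
which is the classical limit variance of the sample median.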

For each $\theta$ in an open subset of Euclidean space let $x\mapsto m_\theta(x)$ be a measurable function such that $\theta\mapsto m_\theta(x)$ is differentiable at $\theta_0$ for $P$-almost every $x$ with derivative $\dot m_{\theta_0}(x)$ and such that, for every $\theta_1$ and $\theta_2$ in a neighborhood of $\theta_0$ and a measurable function $\dot m$ with $P\dot m^2<\infty$
$$
\vert m_{\theta_1}(x) - m_{\theta_2}(x)\vert \le \dot m(x) \Vert \theta_1-\theta_2\Vert\,.
$$
Furthermore, assume that the map $\theta\mapsto Pm_\theta$ admits a second-order Taylor expansion at a point of maximum $\theta_0$ with nonsingular symmetric second derivative matrix $V_{\theta_0}$. If $\bbP_n m_{\hat\theta_n}\ge \sup_\theta\bbP_nm_\theta - o_P(n^{-1})$ and $\hat\theta_n\pto \theta_0$, then
$$
\sqrt n(\hat\theta_n-\theta_0) = -V_{\theta_0}^{-1}\frac{1}{\sqrt n}\sum_{i=1}^n\dot m_{\theta_0}(X_i) +o_P(1)\,.
$$
In particular, the sequence $\sqrt n(\hat\theta_n-\theta_0)$ is asymptotically normal with mean zero and covariance matrix $V_{\theta_0}^{-1}P\dot m_{\theta_0}\dot m_{\theta_0}^TV_{\theta_0}^{-1}$.
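A simple sanity check of these formulas, under the added assumption that $X_1$ is real-valued with mean $\mu$ and finite variance $\sigma^2$, takes $m_\theta(x)=-(x-\theta)^2$. Then
$$
Pm_\theta = -\sigma^2-(\theta-\mu)^2\,,\qquad \theta_0=\mu\,,\qquad V_{\theta_0}=-2\,,\qquad \dot m_{\theta_0}(x)=2(x-\mu)\,,
$$
and the Lipschitz condition holds near $\theta_0$ with $\dot m(x)=2\vert x\vert+C$ for a suitable constant $C$. The expansion becomes
$$
\sqrt n(\hat\theta_n-\theta_0) = \frac{1}{2}\cdot\frac{1}{\sqrt n}\sum_{i=1}^n 2(X_i-\mu)+o_P(1) = \sqrt n(\bar X_n-\mu)+o_P(1)\,,
$$
with limit variance $V_{\theta_0}^{-1}P\dot m_{\theta_0}^2V_{\theta_0}^{-1}=\tfrac{1}{4}\cdot 4\sigma^2=\sigma^2$, in agreement with the central limit theorem for the sample mean $\bar X_n$.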