# Interior-point Method


Nocedal and Wright (2006) and Boyd and Vandenberghe (2004) give slightly different introductions to the interior-point method. More specifically, the former considers only linear programs in standard form, with equality constraints and nonnegativity bounds, while the latter incorporates general inequality constraints via a logarithmic barrier.

## Nocedal and Wright (2006)

Consider the LP problem in standard form,

\begin{equation}
\min\; c^Tx \quad \text{s.t.}\quad Ax = b,\; x\ge 0,\label{primal}
\end{equation}

where $c,x\in\IR^n$ and $b\in\IR^m$, and $A$ is an $m\times n$ matrix with full row rank.

The dual problem is

\begin{equation}
\max\; b^T\lambda \quad \text{s.t.}\quad A^T\lambda + s = c,\; s\ge 0,\label{dual}
\end{equation}

where $\lambda\in \IR^m$ and $s\in \IR^n$. Solutions of \eqref{primal} and \eqref{dual} can be characterized by the KKT conditions,

$$
A^T\lambda + s = c,\qquad Ax = b,\qquad x_is_i = 0,\; i=1,\ldots,n,\qquad (x,s)\ge 0.
$$

Restate the optimality conditions in a slightly different form by means of a mapping $F$ from $\IR^{2n+m}$ to $\IR^{2n+m}$:

$$
F(x,\lambda,s) = \begin{bmatrix} A^T\lambda + s - c\\ Ax - b\\ XSe \end{bmatrix} = 0,\qquad (x,s)\ge 0,
$$

where $X=\diag(x_1,\ldots,x_n), S=\diag(s_1,\ldots,s_n)$ and $e=(1,\ldots,1)^T$. Primal-dual methods generate iterates $(x^k,\lambda^k,s^k)$ that satisfy the bounds strictly, i.e., $x^k > 0$ and $s^k > 0$. This property is the origin of the term **interior-point**.

Like most iterative algorithms in optimization, primal-dual interior-point methods have two basic ingredients: a procedure for determining the step and a measure of the desirability of each point in the search space.

Firstly, the **duality measure** is defined by

$$
\mu = \frac{1}{n}\sum_{i=1}^n x_is_i = \frac{x^Ts}{n},
$$

which measures the average value of the pairwise products $x_is_i$.

The procedure for determining the search direction has its origins in Newton’s method. Newton’s method forms a linear model for $F$ around the current point and obtains the search direction $(\Delta x,\Delta \lambda, \Delta s)$ by solving the following system of linear equations,

$$
J(x,\lambda,s)\begin{bmatrix}\Delta x\\ \Delta\lambda\\ \Delta s\end{bmatrix} = -F(x,\lambda,s),
$$

where $J$ is the Jacobian of $F$. Let $r_c$ and $r_b$ be the first two block rows in $F$,

$$
r_c = A^T\lambda + s - c,\qquad r_b = Ax - b,
$$

then

$$
\begin{bmatrix} 0 & A^T & I\\ A & 0 & 0\\ S & 0 & X \end{bmatrix}\begin{bmatrix}\Delta x\\ \Delta\lambda\\ \Delta s\end{bmatrix} = \begin{bmatrix} -r_c\\ -r_b\\ -XSe \end{bmatrix}.
$$

Note that $J_{ij} = \frac{\partial f_i}{\partial x_j}$, i.e., $J_{i\cdot} = \frac{\partial f_i}{\partial x'}$, so we have $\frac{\partial f_1}{\partial \lambda'} = A^T$.

Usually a full step along this direction would violate the bound $(x,s)\ge 0$, so we perform a line search along the Newton direction and define the new iterate as

$$
(x,\lambda,s) + \alpha(\Delta x,\Delta \lambda,\Delta s)
$$

for some line search parameter $\alpha\in (0,1]$. We often can take only a small step along this direction ($\alpha \ll 1$) before violating the condition $(x,s) > 0$. Hence, the pure Newton direction, sometimes known as the **affine scaling direction**, often does not allow us to make much progress toward a solution.

Most primal-dual methods use a less aggressive Newton direction, aiming for a point whose pairwise products $x_is_i$ are reduced to a lower average value, not all the way to zero. That is, we take a Newton step toward the point for which $x_is_i=\sigma\mu$, where $\mu$ is the current duality measure and $\sigma\in[0, 1]$ is the reduction factor that we wish to achieve in the duality measure on this step. The modified step equation is then

$$
\begin{bmatrix} 0 & A^T & I\\ A & 0 & 0\\ S & 0 & X \end{bmatrix}\begin{bmatrix}\Delta x\\ \Delta\lambda\\ \Delta s\end{bmatrix} = \begin{bmatrix} -r_c\\ -r_b\\ -XSe + \sigma\mu e \end{bmatrix}.
$$
Here $\sigma$ is called the **centering parameter**.

In practice, software implementing interior-point methods does not usually start from a feasible point $(x^0,\lambda^0,s^0)$; the step equations above still apply, with the residuals $r_b$ and $r_c$ remaining nonzero until convergence.
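To make the pieces concrete, here is a dense-matrix sketch of one iteration, combining the modified step equation with a step-length rule that keeps $(x,s) > 0$. The function name `primal_dual_step` and the back-off factor `eta` are illustrative choices, not notation from the book, and a production solver would never form and solve the full $(2n+m)\times(2n+m)$ system directly.

```python
import numpy as np

def primal_dual_step(A, b, c, x, lam, s, sigma=0.1, eta=0.95):
    """One primal-dual interior-point step for min c'x s.t. Ax=b, x>=0.

    Solves the modified Newton system
        [0 A' I; A 0 0; S 0 X] [dx; dlam; ds] = [-r_c; -r_b; -XSe + sigma*mu*e]
    then backs off from the boundary by the factor eta so that x, s stay positive.
    """
    m, n = A.shape
    r_c = A.T @ lam + s - c            # dual residual
    r_b = A @ x - b                    # primal residual
    mu = x @ s / n                     # duality measure
    e = np.ones(n)
    J = np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)],
    ])
    rhs = np.concatenate([-r_c, -r_b, -x * s + sigma * mu * e])
    d = np.linalg.solve(J, rhs)
    dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
    # largest alpha in (0, 1] keeping x + alpha*dx > 0 and s + alpha*ds > 0
    alpha = 1.0
    for v, dv in ((x, dx), (s, ds)):
        neg = dv < 0
        if neg.any():
            alpha = min(alpha, eta * np.min(-v[neg] / dv[neg]))
    return x + alpha * dx, lam + alpha * dlam, s + alpha * ds
```

Iterating this map from a strictly positive starting point drives the duality measure $\mu$ toward zero.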

### Central Path

- feasible set $\cF=\{(x,\lambda,s)\mid Ax=b,A^T\lambda+s=c,(x,s)\ge 0\}$
- strictly feasible set $\cF^0 = \{(x,\lambda,s)\mid Ax=b,A^T\lambda+s=c,(x,s)> 0\}$

The central path $\cC$ is an arc of strictly feasible points, which is parametrized by a scalar $\tau > 0$, and each point $(x_\tau,\lambda_\tau,s_\tau)\in\cC$ satisfies the following equations:

$$
A^T\lambda + s = c,\qquad Ax = b,\qquad x_is_i = \tau,\; i=1,\ldots,n,\qquad (x,s) > 0.
$$

Then define the central path as

$$
\cC = \{(x_\tau,\lambda_\tau,s_\tau)\mid \tau > 0\}.
$$

### Central Path Neighborhoods and Path-Following Methods

Path-following algorithms explicitly restrict the iterates to a neighborhood of the central path $\cC$ and follow $\cC$ to a solution of the linear program.

The two most interesting neighborhoods of $\cC$ are

- $\cN_2(\theta) = \{(x,\lambda, s)\in \cF^0\mid \Vert XSe-\mu e\Vert_2\le \theta\mu\}$ for some $\theta\in [0,1)$
- $\cN_{-\infty}(\gamma) = \{(x,\lambda, s)\in \cF^0\mid x_is_i\ge \gamma\mu\;\forall i=1,2,\ldots,n\}$ for some $\gamma\in (0,1]$
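The two centrality conditions are easy to state in code. The helpers below (hypothetical names, not from the book) check only the product conditions; membership in $\cF^0$, i.e. strict feasibility, is assumed separately.

```python
import numpy as np

def in_N2(x, s, theta):
    """Centrality test for N_2(theta): ||XSe - mu*e||_2 <= theta*mu."""
    mu = x @ s / len(x)
    return np.linalg.norm(x * s - mu) <= theta * mu

def in_N_minus_inf(x, s, gamma):
    """Centrality test for N_{-inf}(gamma): x_i * s_i >= gamma * mu for all i."""
    mu = x @ s / len(x)
    return np.all(x * s >= gamma * mu)
```

A perfectly centered point ($x_is_i$ equal for all $i$) lies in both neighborhoods for any admissible $\theta$, $\gamma$; a point with very unbalanced products fails both tests.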

## Boyd and Vandenberghe (2004)

Consider a convex optimization problem that includes inequality constraints,

\begin{equation}
\min\; f_0(x)\quad \text{s.t.}\quad f_i(x)\le 0,\; i=1,\ldots,m,\qquad Ax = b,\label{11.1}
\end{equation}

where $f_0,\ldots,f_m:\IR^n\rightarrow \IR$ are convex and twice continuously differentiable, and $A\in \IR^{p\times n}$ with rank $p < n$.

- Assume that the problem is solvable, i.e., an optimal $x^\star$ exists. Denote the optimal value $f_0(x^\star)$ as $p^\star$.
- Assume that the problem is strictly feasible, i.e., there exists $x\in\cD$ that satisfies $Ax=b$ and $f_i(x) < 0$ for $i=1,\ldots,m$.

Thus, by Slater's condition, strong duality holds and there exist dual optimal $\lambda^\star\in\IR^m,\nu^\star\in\IR^p$, which together with $x^\star$ satisfy the KKT conditions

$$
Ax^\star = b,\quad f_i(x^\star)\le 0,\quad \lambda^\star\succeq 0,\quad \nabla f_0(x^\star)+\sum_{i=1}^m\lambda_i^\star\nabla f_i(x^\star)+A^T\nu^\star = 0,\quad \lambda_i^\star f_i(x^\star) = 0,\; i=1,\ldots,m.
$$

### Logarithmic barrier function and central path

**The goal** is to approximately formulate the inequality constrained problem as an equality constrained problem to which Newton’s method can be applied.

Rewrite \eqref{11.1} to make the inequality constraints implicit in the objective,

$$
\min\; f_0(x)+\sum_{i=1}^m I_-(f_i(x))\quad \text{s.t.}\quad Ax=b,
$$

where $I_-:\IR\rightarrow \IR$ is the indicator function for the nonpositive reals,

$$
I_-(u) = \begin{cases} 0 & u\le 0,\\ \infty & u > 0. \end{cases}
$$

The basic idea of the barrier method is to approximate the indicator function $I_-$ by the function

$$
\hat I_-(u) = -(1/t)\log(-u),\qquad \text{dom}\, \hat I_- = -\IR_{++},
$$

where $t>0$ is a parameter that sets the accuracy of the approximation. Unlike $I_-$, $\hat I_-$ is differentiable and closed: it increases to $\infty$ as $u$ increases to 0.
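A few numerical values make the role of $t$ visible: for fixed $u<0$ the approximation $\hat I_-(u) = -(1/t)\log(-u)$ shrinks toward the true indicator value $0$ as $t$ grows, while still blowing up as $u\to 0^-$. (A tiny illustrative sketch, `I_hat` being an assumed name.)

```python
import math

def I_hat(u, t):
    """Log-barrier approximation of the indicator I_-: -(1/t)*log(-u), for u < 0."""
    return -math.log(-u) / t

# The approximation of I_-(u) = 0 tightens as t grows:
vals = [I_hat(-0.5, t) for t in (1, 10, 100)]   # strictly decreasing toward 0
# and it still grows without bound as u approaches 0 from below:
wall = I_hat(-1e-9, 10)
```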

Thus, we have the approximation

$$
\min\; f_0(x)+\sum_{i=1}^m -(1/t)\log(-f_i(x))\quad \text{s.t.}\quad Ax=b.
$$

The function

$$
\phi(x) = -\sum_{i=1}^m \log(-f_i(x)),
$$

with $\text{dom}\, \phi=\{x\in\IR^n\mid f_i(x) < 0,i=1,\ldots,m\}$ is called the **logarithmic barrier** or **log barrier**.
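For later use in Newton's method, the gradient and Hessian of the log barrier follow directly from the chain rule:

$$
\nabla\phi(x) = \sum_{i=1}^m \frac{1}{-f_i(x)}\nabla f_i(x),\qquad
\nabla^2\phi(x) = \sum_{i=1}^m \frac{1}{f_i(x)^2}\nabla f_i(x)\nabla f_i(x)^T + \sum_{i=1}^m \frac{1}{-f_i(x)}\nabla^2 f_i(x).
$$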

We can simplify notation by multiplying the objective by $t$, which gives the equivalent problem

\begin{equation}
\min\; t f_0(x)+\phi(x)\quad \text{s.t.}\quad Ax=b.\label{11.6}
\end{equation}

For $t > 0$, define $x^\star(t)$ as the solution of \eqref{11.6}. The **central path** associated with problem \eqref{11.1} is defined as the set of points $x^\star(t), t>0$, which we call the **central points**. Points on the central path are characterized by the following necessary and sufficient conditions: $x^\star(t)$ is strictly feasible, i.e., satisfies

$$
Ax^\star(t) = b,\qquad f_i(x^\star(t)) < 0,\; i=1,\ldots,m,
$$

and there exists a $\hat \nu\in\IR^p$ such that

\begin{equation}
t\nabla f_0(x^\star(t)) + \sum_{i=1}^m \frac{1}{-f_i(x^\star(t))}\nabla f_i(x^\star(t)) + A^T\hat\nu = 0.\label{11.7}
\end{equation}

It follows that every central point yields a dual feasible point, and hence a lower bound on the optimal value $p^\star$. More specifically, define

$$
\lambda_i^\star(t) = -\frac{1}{t f_i(x^\star(t))},\; i=1,\ldots,m,\qquad \nu^\star(t) = \hat\nu/t.
$$

We claim that the pair $(\lambda^\star(t),\nu^\star(t))$ is dual feasible, with associated duality gap $m/t$.
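To see why, take $\lambda_i^\star(t) = -1/(t f_i(x^\star(t)))$ and $\nu^\star(t)=\hat\nu/t$. Dividing the centrality condition by $t$ shows that $x^\star(t)$ minimizes the Lagrangian $L(x,\lambda^\star(t),\nu^\star(t))$ over $x$, so the dual function is attained there:

$$
g(\lambda^\star(t),\nu^\star(t)) = f_0(x^\star(t)) + \sum_{i=1}^m \lambda_i^\star(t) f_i(x^\star(t)) + \nu^\star(t)^T\big(Ax^\star(t)-b\big) = f_0(x^\star(t)) - \frac{m}{t},
$$

since each $\lambda_i^\star(t) f_i(x^\star(t)) = -1/t$ and $Ax^\star(t)=b$. Weak duality then gives $f_0(x^\star(t)) - p^\star \le m/t$.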

### The barrier method

Since the duality gap at $x^\star(t)$ is $m/t$, taking $t=m/\epsilon$ and solving \eqref{11.6} yields an $\epsilon$-suboptimal solution of \eqref{11.1}. In practice, the barrier method instead solves a sequence of centering problems for increasing values of $t$, starting each Newton minimization from the previously computed central point.

The central path conditions \eqref{11.7} can be interpreted as a continuous deformation of the KKT optimality conditions. A point $x$ is equal to $x^\star(t)$ if and only if there exist $\lambda,\nu$ such that

$$
Ax = b,\quad f_i(x)\le 0,\quad \lambda\succeq 0,\quad \nabla f_0(x)+\sum_{i=1}^m\lambda_i\nabla f_i(x)+A^T\nu = 0,\quad -\lambda_i f_i(x) = 1/t,\; i=1,\ldots,m,
$$

which differs from the KKT conditions only in that complementarity $\lambda_if_i(x)=0$ is replaced by $-\lambda_if_i(x)=1/t$. This system can be solved by Newton's method (ignoring the inequalities).

Firstly, we can eliminate the variables via $\lambda_i = -1/(tf_i(x))$, which leaves

$$
\nabla f_0(x) + \sum_{i=1}^m \frac{1}{-t f_i(x)}\nabla f_i(x) + A^T\nu = 0,\qquad Ax = b.
$$

Multiplying the first equation by $t$ and linearizing at a strictly feasible $x$ (so the residual $Ax-b$ vanishes), we can write the resulting system in matrix form,

$$
\begin{bmatrix} t\nabla^2 f_0(x) + \nabla^2\phi(x) & A^T\\ A & 0 \end{bmatrix}\begin{bmatrix} \Delta x_{\rm nt}\\ \nu_{\rm nt} \end{bmatrix} = -\begin{bmatrix} t\nabla f_0(x)+\nabla\phi(x)\\ 0 \end{bmatrix},
$$

which coincides with the Newton step for the centering problem \eqref{11.6}; solving it gives the Newton direction $\Delta x_{\rm nt}$.
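A minimal sketch of this centering Newton iteration for the LP case ($f_0(x)=c^Tx$, $f_i(x)=-x_i$, so $\phi(x)=-\sum_i\log x_i$), assuming a strictly feasible starting point and using simple step damping to stay in the domain. The function name and loop structure are illustrative, not from the book.

```python
import numpy as np

def centering_newton(x, t, c, A, iters=50):
    """Damped Newton's method for the centering problem
        min  t*c'x - sum_i log x_i   s.t.  Ax = b,
    i.e. the LP barrier subproblem with f_0(x) = c'x, f_i(x) = -x_i.
    x must be strictly feasible (Ax = b, x > 0); since A dx = 0 at every
    step, the equality constraint is preserved implicitly.
    Illustrative sketch: forms and solves the dense KKT system directly."""
    n, p = len(x), A.shape[0]
    for _ in range(iters):
        g = t * c - 1.0 / x                  # t*grad f0 + grad phi
        H = np.diag(1.0 / x**2)              # hess phi (hess f0 = 0 for an LP)
        K = np.block([[H, A.T], [A, np.zeros((p, p))]])
        rhs = np.concatenate([-g, np.zeros(p)])
        dx = np.linalg.solve(K, rhs)[:n]
        alpha = 1.0                          # damp the step to keep x > 0
        while np.any(x + alpha * dx <= 0):
            alpha *= 0.5
        x = x + alpha * dx
    return x
```

At the computed center, the stationarity condition $tc - 1/x + A^T\nu = 0$ holds, and $f_0(x^\star(t))$ is within $m/t$ of the optimal value.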

### Primal-dual search direction

Define

$$
r_t(x,\lambda,\nu) = \begin{bmatrix} \nabla f_0(x) + Df(x)^T\lambda + A^T\nu\\ -\diag(\lambda)f(x) - (1/t)\mathbf 1\\ Ax - b \end{bmatrix},
$$

where $f(x) = (f_1(x),\ldots,f_m(x))^T$, $Df(x)$ is its derivative matrix, and $t > 0$. The first block component of $r_t$,

$$
r_{dual} = \nabla f_0(x) + Df(x)^T\lambda + A^T\nu,
$$

is called the **dual residual**, and the last block component, $r_{pri}=Ax-b$, is called the **primal residual**. The middle block,

$$
r_{cent} = -\diag(\lambda)f(x) - (1/t)\mathbf 1,
$$

is the **centrality residual**. The Newton step is

$$
\begin{bmatrix} \nabla^2 f_0(x) + \sum_{i=1}^m \lambda_i\nabla^2 f_i(x) & Df(x)^T & A^T\\ -\diag(\lambda)Df(x) & -\diag(f(x)) & 0\\ A & 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta x\\ \Delta\lambda\\ \Delta\nu \end{bmatrix} = -r_t(x,\lambda,\nu).
$$
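For concreteness, the residual $r_t$ specializes as follows in the LP case $f_0(x)=c^Tx$, $f_i(x)=-x_i$ (so $\nabla f_0 = c$, $f(x) = -x$, and $Df(x)=-I$). The helper below is a hypothetical name, not code from either reference.

```python
import numpy as np

def residual_rt(x, lam, nu, t, c, A, b):
    """The three blocks of r_t(x, lam, nu) for the LP special case."""
    r_dual = c - lam + A.T @ nu              # grad f0 + Df(x)' lam + A' nu
    r_cent = x * lam - np.ones(len(x)) / t   # -diag(lam) f(x) - (1/t) 1
    r_pri = A @ x - b                        # Ax - b
    return r_dual, r_cent, r_pri
```

The primal-dual method drives all three blocks to zero simultaneously; note that for the LP the centrality block reduces to the familiar condition $x_i\lambda_i = 1/t$.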