# A simple ISTA ↔ FISTA switch

Today, let us revisit the topic of enforcing sparsity and see an easy trick to accelerate the convergence speed of the algorithm (demo).

The basic algorithm that we discussed last time is the iterative shrinkage/thresholding algorithm (ISTA) that can be specified as follows

where $t = 1, 2, 3, \dots$ is the iteration number, $\mathbf{y}$ is the measurement vector, $\mathbf{H}$ is the measurement matrix that models the acquisition system, $\gamma > 0$ is a step-size of the algorithm that we can always set to the inverse of the largest eigenvalue of $\mathbf{H}^\mathrm{T}\mathbf{H}$ to ensure convergence (i.e., set $\gamma = 1/L$ with $L = \lambda_{\text{max}}(\mathbf{H}^\mathrm{T}\mathbf{H})$), $\tau > 0$ is the regularization parameter that controls the sparsity of the final solution (larger $\tau$ leads to a sparser solution), and finally $\eta$ is a scalar thresholding function applied in a component-wise fashion. One of the most popular thresholding functions is the soft-thresholding defined as

where $(\cdot)_+$ returns the positive part of its argument, and $\mathrm{sgn}(\cdot)$ is a signum function that returns $+1$ if its argument is positive and $-1$ when it is negative.

ISTA is a very well understood method, and it is well known that its rate of convergence corresponds to that of the gradient-descent method, which is $O(1/t)$.

Let us considering the following simple iteration

where $\mathbf{f}^0 = \mathbf{s}^0 = \mathbf{f}_{\text{init}}$. When $q_t = 1$ for all $t = 1, 2, 3, \dots$ the iteration corresponds to ISTA, however, an appropriate choice of$\{q_t\}_{t \in [0, 1, \dots]}$ leads to a faster $O(1/t^2)$  convergence, which is crucial for larger scale problems where one tries to reduce the amount of matrix-vector products with $\mathbf{H}$ and $\mathbf{H}^\mathrm{T}$. The faster version of ISTA was originally proposed by Beck & Teboulle 2009 and is widely known as fast ISTA (FISTA).

So, what is that appropriate choice of $\{q_t\}_{t \in [0, 1, \dots]}$?

Beck & Teboulle proposed to initialize $q_0 = 1$ and then setting the rest iteratively as follows

A short note on the Python demo. It was done as an IPython notebook in a fully self-contained way. The first two cells create and save two files matrixmodel.py and algorithm.py that are re-usable as stand-alone files.

The switch from ISTA to FISTA is done in cell 8:

# Reconstruct with ISTA
[fhatISTA, costISTA] = fistaEst(y, forwardObj, tau, numIter, accelerate=False)

# Reconstruct with FISTA
[fhatFISTA, costFISTA] = fistaEst(y, forwardObj, tau, numIter, accelerate=True)

The output of the demo is the following figure plotting the results of both algorithms within 100 iterations: