A preliminary work [S. Bourguignon et al., IEEE Trans. Signal Process., 2016] studied the solution of sparse approximation problems through mixed-integer programming (MIP), that is, optimisation involving both continuous and integer variables. MIP formulations are well suited to this problem, since the L0-norm naturally introduces a binary decision variable for each component (zero or non-zero?). MIP optimisation has received considerable interest over the past fifteen years, and now benefits from very efficient numerical solvers. That paper showed that:
- exact optimisation (that is, computation of the optimal solution together with a proof of optimality) through MIP formulations is possible for some moderate-size problems,
- the global minimizer(s) of the L0 problem show(s) better properties in practice than the solutions of standard, suboptimal sparse estimation algorithms.
However, the efficiency of such MIP methods remains limited by the problem size, the sparsity degree of the solution and the noise level. In that paper, rather standard operations research formulations were used, and resolution was performed by a generic solver (IBM ILOG CPLEX).
The MIMOSA project aims to study various levers in order to overcome the limits of generic MIP solvers and thus tackle larger and more complex problems. The proposed ideas are centred on exploiting knowledge of the problem from the sparse approximation perspective in order to build more efficient algorithms: How can prior knowledge about the problem and/or information extracted from the data help reduce the computational cost? How can state-of-the-art suboptimal algorithms dedicated to sparse optimisation be integrated as inner blocks of a global optimisation strategy?
Work Package 1: Mathematical modelling and formulation refinements.
A first research axis concerns the different possible mathematical formulations of sparse optimisation as a MIP. They rely on introducing additional binary optimisation variables bi, such that bi=0 ⇔ xi=0. Then, the non-linear sparsity measure ‖x‖0 becomes the linear expression ∑ bi. However, translating this logical constraint into a mathematical program compatible with MIP is a delicate operation. One classical solution in operations research – referred to as the bigM reformulation – assumes that each component of the solution is bounded in magnitude by some sufficiently large value M>0. One can then introduce the linear inequality constraints -Mbi ≤ xi ≤ Mbi. Tuning the value of M is a critical issue, since it must be as tight as possible in order to improve numerical performance. Several options will be studied.
- In some sparsity-driven signal processing applications, statistical rules can be investigated to infer reasonable values of M from the problem structure and from the data: can we derive a reasonably tight upper bound on |xi| from the data and/or from prior information about the noise statistics? Could we even derive adaptive values Mi, one for each unknown |xi|? Obtaining values of Mi that are as tight as possible should strongly impact computational performance. A more ambitious idea is to refine these bounds along the combinatorial tree search (see WP 2 on the implementation of specific solvers).
- Alternatives to the bigM reformulation will also be investigated, such as bilinear equality constraints xi(1-bi)=0. Such constraints are not linear, which may be computationally less efficient, but they introduce no additional parameter; their efficiency in the context of sparse approximation therefore requires further investigation.
- Another approach will study exact piecewise linear or piecewise quadratic relaxations of the L0 norm, recently proposed in the literature – exact meaning that the L0 problem and its relaxation have the same global minimizers. It should be possible to convert the corresponding relaxed problem into a MIP without any additional parameter tuning.
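The support-enumeration view underlying these formulations can be sketched numerically. The toy example below (hypothetical data, not taken from the project) solves a tiny cardinality-constrained least-squares problem exactly by enumerating the binary supports, which is what a MIP solver explores implicitly; it also illustrates how a valid bigM bound relates to the magnitude of candidate solutions.

```python
import numpy as np
from itertools import combinations

def exact_l0(A, y, K):
    """Minimise ||A x - y||^2 subject to ||x||_0 <= K by exhaustively
    enumerating the binary supports (b_i = 0  <=>  x_i = 0)."""
    n = A.shape[1]
    best_x, best_r, best_S = np.zeros(n), float(y @ y), ()
    for k in range(1, K + 1):
        for S in combinations(range(n), k):
            # least squares restricted to the columns selected by b
            xs, *_ = np.linalg.lstsq(A[:, list(S)], y, rcond=None)
            x = np.zeros(n)
            x[list(S)] = xs
            r = float(np.sum((A @ x - y) ** 2))
            if r < best_r:
                best_x, best_r, best_S = x, r, S
    return best_x, best_r, best_S

# Toy instance: y is exactly 2*column_0 + 3*column_1.
A = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [1., 1., 1.]])
y = np.array([2., 3., 0., 5.])
x, r, S = exact_l0(A, y, 2)
# Any M larger than max_i |x_i| over candidate solutions (here, M > 3)
# would be a valid bigM bound for this instance.
```

The enumeration visits O(n^K) supports, which is hopeless beyond small sizes; the point of the MIP machinery of WP 2 is precisely to avoid visiting most of them.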
Work Package 2: Mixed-integer programming resolution: from generic to specific solvers.
Most works addressing sparse estimation via MIP use general-purpose solvers. Consequently, once the problem has been reformulated as a MIP, the solver becomes essentially blind to the initial problem structure. Integrating knowledge of the sparse approximation problem is therefore a potentially powerful lever for improving the efficiency of MIP resolution strategies.
Most MIP solvers rely on a branch-and-bound strategy, i.e., a tree-structured implicit enumeration algorithm based on successive continuous relaxations of the integer variables. Each branch generates a subproblem by fixing some binary variables, and a set of branches is constructed corresponding to the different possible configurations of these variables. The aim is then to discard huge parts of the remaining combinatorial tree by lower-bounding the objective function on the corresponding subproblems. To obtain such lower bounds, a continuous relaxation of each subproblem is formed by relaxing the corresponding binary variables into the interval [0,1]. Solving each relaxed subproblem then amounts to a linear or quadratic program, which can be solved efficiently.
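A minimal sketch of such a tree search, on hypothetical toy data: each node fixes some binary variables, a lower bound is computed, and subtrees that cannot improve on the incumbent are pruned. For simplicity, the bound here keeps all columns not yet fixed to zero; it is not the LP/QP relaxation of the bigM programme that a generic solver would compute, but it illustrates the pruning mechanism.

```python
import numpy as np

def ls_residual(A, y, cols):
    """Least-squares residual ||A_S x_S - y||^2 restricted to columns `cols`."""
    if not cols:
        return float(y @ y)
    xs, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
    return float(np.sum((A[:, cols] @ xs - y) ** 2))

def branch_and_bound(A, y, K):
    """Depth-first branch-and-bound for min ||A x - y||^2 s.t. ||x||_0 <= K.
    At a node, variables b_0..b_{i-1} are fixed; the lower bound keeps every
    column not yet fixed to zero (a simplified, valid bound)."""
    n = A.shape[1]
    best = {"r": float(y @ y), "S": []}
    count = [0]                             # number of explored nodes

    def rec(i, included):
        count[0] += 1
        # lower bound: included columns plus every undecided column j >= i
        lb = ls_residual(A, y, included + list(range(i, n)))
        if lb >= best["r"]:
            return                          # prune this whole subtree
        if i == n or len(included) == K:    # leaf: remaining b_j forced to 0
            r = ls_residual(A, y, included)
            if r < best["r"]:
                best["r"], best["S"] = r, sorted(included)
            return
        rec(i + 1, included + [i])          # branch b_i = 1
        rec(i + 1, included)                # branch b_i = 0

    rec(0, [])
    return best["r"], best["S"], count[0]

A = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [1., 1., 1.]])
y = np.array([2., 3., 0., 5.])
r, S, count = branch_and_bound(A, y, 2)
```

On this instance only 5 nodes are explored instead of the 13 of the full tree; pruning quality depends directly on the tightness of the bounds, which is where the bigM values of WP 1 come into play.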
The efficiency of the branch-and-bound algorithm essentially relies on defining good strategies for exploring the binary tree: which binary variables should be fixed, and which should be relaxed? How should one move from one branch to another? Should one change the configuration of the fixed variables, or go deeper into the tree and change the set of relaxed variables? Whereas generic heuristic decision rules are implemented in off-the-shelf solvers, exploiting local exploration strategies proposed in the field of sparse approximation may help answer these questions. For example, greedy-like procedures provide several rules for switching from one binary configuration to another. Although such algorithms do not come with strong optimality guarantees, they can be advantageously used to design priority rules for local exploration (that is, the choice of new branches), and thus find good feasible solutions more quickly than generic branching rules do.
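Orthogonal Matching Pursuit (OMP) is one such greedy procedure. The sketch below (hypothetical toy data) shows how it produces a feasible support whose residual can serve as an early incumbent, i.e., an upper bound for pruning, while its selection order suggests branching priorities.

```python
import numpy as np

def omp(A, y, K):
    """Orthogonal Matching Pursuit: at each step, pick the column most
    correlated with the current residual, then refit by least squares on
    the selected support. Returns a feasible support and its residual."""
    support, r = [], y.copy()
    for _ in range(K):
        scores = np.abs(A.T @ r)
        scores[support] = -np.inf            # never reselect a chosen column
        support.append(int(np.argmax(scores)))
        xs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ xs
    return sorted(support), float(r @ r)

A = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [1., 1., 1.]])
y = np.array([2., 3., 0., 5.])
S, res = omp(A, y, 2)
# The pair (S, res) can seed a branch-and-bound: res is an immediate upper
# bound (incumbent), and the selection order can define branching priorities.
```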
Work Package 3: Applications
The developments of WP 1 and WP 2 will be adapted to application cases, whose specific features may provide additional levers for improving optimisation efficiency.
- One way to gain efficiency in MIP resolution is to incorporate additional constraints into the problem. Indeed, many signal processing problems naturally involve constraints on the sought coefficients. For example, in some ultrasonic spike deconvolution problems, the sign of each echo may be predictable, inducing positivity or negativity constraints. This is also the case in hyperspectral unmixing, where the unknown variables represent fractional abundances and are therefore positive and sum to one. Since such constraints are linear, MIP-based approaches can easily be adapted to these cases (at least as far as pixel-wise spectral unmixing is considered). One important point is that adding such constraints may reduce the computational time, whereas it generally penalises the efficiency of classical (convex or greedy) sparse approximation algorithms.
- Global optimisation of criteria involving structured sparsity is also worth studying. Structured sparsity – where subsets of coefficients are jointly zero or non-zero – has become a very active field of research for modelling different interactions between the unknowns. The related estimation problems are generally tackled via suboptimal methods, either convex optimisation approaches involving mixed norms or extensions of greedy algorithms. Exact optimisation of such problems through MIP should also be tractable for moderate-size problems, with additional modelling effort required to capture the dependencies between binary variables. A concrete example is sparse spectral analysis, where sparsity operates on variables in ℝ2, corresponding to the real and imaginary parts of complex-valued Fourier coefficients.
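A minimal sketch of exact group-sparse estimation on hypothetical toy data: one binary indicator per group of coefficients (here, pairs standing in for the real and imaginary parts of one complex Fourier coefficient), with exhaustive enumeration over group supports.

```python
import numpy as np
from itertools import combinations

def exact_group_l0(A, y, groups, K):
    """Group-sparse least squares with at most K active groups: all the
    coefficients of a group are jointly zero or non-zero, so a MIP would
    attach a single binary indicator b_g to each group. Here the group
    supports are simply enumerated exhaustively."""
    n = A.shape[1]
    best_r, best_G, best_x = float(y @ y), (), np.zeros(n)
    for k in range(1, K + 1):
        for G in combinations(range(len(groups)), k):
            cols = [i for g in G for i in groups[g]]
            xs, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
            x = np.zeros(n)
            x[cols] = xs
            r = float(np.sum((A @ x - y) ** 2))
            if r < best_r:
                best_r, best_G, best_x = r, G, x
    return best_r, best_G, best_x

# Toy instance: 4 real columns grouped in (re, im) pairs, as in sparse
# spectral analysis where sparsity acts on R^2 pairs.
A = np.eye(4)
y = np.array([1., 2., 0., 0.])
r, G, x = exact_group_l0(A, y, [(0, 1), (2, 3)], 1)
```

Replacing per-coefficient indicators by per-group ones reduces the number of binary variables, at the price of the extra modelling effort mentioned above.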