Abstract
In this paper, we propose a method for finding an optimum solution of the nonlinear optimization problem with equality constraints. The original problem is replaced by a sequence of unconstrained problems of the augmented Lagrangian function. The subproblems are minimized by using quasi-Newton methods. The Hessian matrix of the augmented Lagrangian is updated by using a new secant approximations which is positive definite at every iteration. A set of computational results on test problems from CUTEr collection are presented.
Keywords
Introduction
Many applications in engineering, decision science, operations research and other branches are formulated as constrained optimization problems. In these applications, the variables may interrelated by physical laws like the conservation of mass or energy, Kirchhoffs voltage and current laws, and other system equalities or inequalities that must be satisfied [3, 43].
We consider the nonlinear equality constrainedproblem:
One of the major developments and successes in continuous optimization is the efficient quasi-Newton methods for solving minimization problems. The important reasons for this success are:
Positive definite quasi-Newton updates is compatible with line search rules that maintain global convergence. The Hessian matrix can be approximated at every iteration and positive definiteness still preserved. In a neighborhood of a strong local minimizer the Hessian matrix maintains positive definiteness. Unit step size helps for rapid local convergence.
Since we are concerned in equality constraints, any inequality constraints have been converted to equality by adding slack variables. The idea of transforming a constrained optimization problem to a sequence of unconstrained problems has played a main role in the formulation of algorithms since the 1960s [18].
An effective method for solving (1) involves an unconstrained function based on the quadratic penalty function, which consist of the function f in addition to the sum of the squares of the constraint violations. Thus at every iteration an easier optimization problem is to be minimized and the solution is obtained when the penalty parameter approaches infinity. Penalty methods have been grown speedily in the past years. These methods consist of two levels: inner and outer iterations. So a constrained optimization problem can be solved via a two-level cycle: in inner iteration, we start from a given point and use any local minimization method to minimize the penalty function, and in outer iteration, we test for convergence and set the value of the penalty parameter. These methods regarded as unfashionable and numerically unstable, fundamentally because of the inevitable ill-conditioning that occurs as the penalty parameter tends to infinity.
One of the most important penalty methods in nonlinear programming is the augmented Lagrangian methods, also known as the multiplier methods. Have gained their popularity for the following reasons:
They avoid the ill-conditioning difficulties encountered by the classical penalty algorithms like the Courant penalty method or log-barrier approaches which result as the penalty parameter goes to infinity [4–6, 19]. It dispenses with the need for iterates to stay strictly feasible with respect to the inequality constraints [36]. It does not make any non smoothness, like Sl1QP and Sl∞QP methods [36]. An advantage of these methods are that they can be applied matrix-free [2, 11] and still have fast local convergence under relatively weak assumptions [17, 29]. Another feature of the Augmented Lagrangian approach is that the solution of the inner iteration can be obtained using algorithms that can deal with a large number of variables without making use of factorization of matrices of any kind.
The work on augmented Lagrangian methods proposed first by Hestenes [28], Powell [39] and Rockfellar [40], and hence was adapted by Conn et al. [11]. Augmented Lagrangian function is an unconstrained function that contain both a Lagrange multiplier term and a quadratic penalty term which does not require for convergence that penalty parameter approach infinity. Augmented Lagrangian methods decreases the probability of ill-conditioning by introducing Lagrange multiplier estimates at every outer iteration into the minimized function. Bertsekas [5] shows that the method possesses at least linear convergence, and at a rate that is more satisfactory than that of the quadratic penalty method.
Since the 1990s, the interest in augmented Lagrangian methods has attracted much attention, and many algorithms based on using the augmented Lagrangian as an objective function for successive unconstrained minimization is proposed.
The implementations can be built from the methods and software for unconstrained optimization. Effective algorithms for solving nonlinear minimization problems based on the augmented Lagrangian function have been proposed [13–15, 31]. One important work on practical augmented Lagrangian method was the package LANCELOT [12] that is based on the paper by Conn et al. [11]. LANCELOT algorithm treats a problem containing equality constraints and bound constraints to find the approximate solution of the constrained problem. Similarly, the package MINOS invented by Murtagh and Saunders [34] is a software for solving large-scale optimization problems, that finds the solution of a sequence of subproblems in which the constraints are linearized and the objective is an augmented Lagrangian function.
Augmented Lagrangian functions can be applied as a merit function for sequential quadratic programming (SQP) methods [9, 42]. These methods find an approximate solution of a sequence of quadratic programming subproblems in which a quadratic model of the objective function is minimized subject to linearization of the constraints. Evolutionary Algorithms are a class of stochastic optimization search methods that have been successfully applied to solve optimization problems such as genetic algorithms, evolutionary programming, evolution strategies, Particle Swarm Optimization, and their variants [1, 37].
A filter-typed method was present by Niu and Yuan [35] for nonlinear constrained optimization problems. The inner iteration is to find a solution of a quadratic approximation to the augmented Lagrangian function in the trust region, instead of minimizing the standard augmented Lagrangian function. Good numerical results are given, but, no convergence results are presented in that paper.
Standard trust region SQP methods may facing infeasible subproblems [44]. Sl1QP and Sl∞QP are designed to solve this situation, but these two approaches still involve difficulties arisen from nonsmoothness [44]. The augmented Lagrangian method attracts much attention due to its applications to sparse optimization in compressive sensing and low rank matrix optimization problems. So it may be seem that a reasonable choice is to try to minimize the augmented Lagrangian function but with a new quasi-newton updates for the Hessian matrix. With this we solve the difficulties from infeasibility and avoid the nonsmoothness problem.
The purpose of this paper is to propose, analyze, and present an algorithm for the constrained minimization problems based on the sequential minimization of the augmented Lagrangian function to overcome the disadvantages described above. The original problem is replaced by a sequence of unconstrained minimization problems. The subproblems are minimized by using new quasi-Newton method where the Hessian matrix of the augmented Lagrangian is positive definite at every iteration.
This paper is organized as follows. In Section 2 the algorithm and the technique used to approximate the Hessian matrix are proposed. The stability of the search direction is given in Section 3. An applications of the algorithm to nonlinear equality constrained optimization problems are in Section 4. Conclusions are presented in Section 5. Throughout the paper, ∥.∥ denotes the Euclidean norm of vectors.
The augmented Lagrangian function for general constrained optimization problem (1) is defined by
If λ∗ is the Lagrange multiplier vector associated with a local minimum x∗ and the penalty parameters are sufficiently large, then x∗ is a strict local solution of L (x, λ∗, μ) [36]. Hence from this theoretical result given a point x
k
, a new iterate xk+1 can be found by solving the subproblem:
Therefore, by using the augmented Lagrangian function we try to generate a sequence of points {x k }, each of them is an approximate solution of (3) and update λ k and μ k to make xk+1 feasible point that is to reduce ∥c (xk+1)∥ to certain error.
The unconstrained subproblem can be minimized by using the quasi-Newton methods which have been combined with line search methods. The step size is the optimal point for the problem
Quasi-Newton methods is an efficient method for solving (3). In quasi-Newton methods the Hessian matrix does not need to be computed, so it can be used when the Hessian is unavailable or difficult to calculate. Several recent computational studies have shown that the quasi-Newton methods are effective methods for solving minimization problems. These methods use only first derivatives to make an approximation to the inverse of Hessian matrix at each step instead of performing the computational work of evaluating and inverting Hessian.
The idea of a line search is to control the length of the used direction by solving the one-dimensional problem (5). The gradient of (3) can computed, we can effectively do a one-dimensional search with derivatives. When the search direction is a descent direction, which can be occur if the Hessian matrix is positive definite, the line search can find a positive step α > 0. Since the search direction is not exactly the right direction, there is no need to find an exact minimum, usually it is enough to move towards the optimum until convergence. One condition that measures progress is called the Armijo condition, which we will use to find an optimal step length.
For the stopping criterion, given a sequence
A critical point in the development of the algorithm is the strategies of updating penalty parameters and Lagrange multipliers. Every update has different effects on the efficiency of algorithms. The algorithm must generate a nondecreasing sequence {μ k } to maintain convergence. The algorithm will contain m different penalty parameters, every component of c (x) has one penalty parameter.
The following strategy will be applied: the penalty parameter is increased if sufficient improvement of constraint violations is not obtained. To be more exact, if ∣c
i
(xk+1)∣ is not sufficiently less than ∣c
i
(x
k
)∣, then the penalty parameter is increased to satisfy
We update the multiplier vector based on the current information, in order to be more accurately keeping it as a constant through the procedure of minimizing the augmented Lagrangian function then update it in the outer iteration by some formula. We employ a first order formula for updating the Lagrange multiplier vector, starting with λ0 = 0,
Update of Hessian matrix
In the inner iteration the unconstrained subproblem can be solved by minimizing the augmented Lagrangian starting from the current iterate x k and generate a better minimizer xk+1 by using quasi-Newton methods.
Let Ak+1 be an approximation of
Let Bk+1 be the inverse of Ak+1 at xk+1 we present an update formula of rank two as follows [41]
or in the form
From the previous discussions the inner and outer iterations are summarized in the following algorithms.
Given Find an approximate minimizer xk+1 of (3) by using Algorithm 2.2 If ∥ ∇
x
L (xk+1, λ
k
, μ
k
) ∥ ≤ ∊1 and ∥c (xk+1) ∥ ≤ ∊2, stop with approximate solution xk+1 as a minimizer of the constrained optimization problem (1). Update the Lagrange multiplier vector with
Choose new penalty parameters such that: for i = 1,…, m, set
Set k = k + 1 and go to step 1.
The iterative procedure of the inner iteration which is used in step 1 of Algorithm 2.1 can be given as follows:
Start with an initial point Compute the gradient of the augmented Lagrangian function, g
k
, at point x
k
, and set
Find an acceptable stepsize α
k
in the direction d
k
and set
Test the new point xk+1 for optimality. If Update the Hessian matrix. Set x
k
= xk+1, B
k
= Bk+1 and go to step 1.
In Algorithm 2.2
It is usual for descent methods to be stable because one ensures that the function to be minimized is decreased by each step. It will be shown in this section that the direction of search -B
k
g
k
is downhill, so α
k
can always be chosen to be positive. The direction will be descent if and only if
The main aim of this section is to examine the numerical performance of the algorithm on a set of test problems. The proposed algorithm is executed in Fortran 90 on a PC running ubuntu 14.04.2. Algorithm 2.1 is tested with the results given in [25], thus we test 71 equality constrained optimization problems from CUTEr [27]. In our implementations, the maximum number of iterations is 500. Also the maximum number of function evaluations is set as 4000 and we use the default starting point x0 for each problem. The Armijo condition is used to determine the step length and the termination condition for each problem is
Tables 1 and 2 shows the computation results, where for each problem we have the following meanings
n: denote the number of variables; m: is the number of equality constraints; Iter: the total number of iterations; n
f
: the total number of function evaluations.
In CUTEr the function value and constraints values can be obtained simultaneously by calling the subroutine " cfn ", thus n
f
denotes the number of " cfn " calls. A table entry " F " means that the corresponding algorithm terminates unsuccessfully where the optimum point can not be reached because the number of function evaluations or the number of iterations exceeds the maximum number.
Test results for problems (A–H)
Test results for problems (A–H)
Test results for problems (I–Z)
From Tables 1 and 2, we can see that the total number of iterations in Algorithm 2.1 is less than that of both methods for almost all the problems. Algorithm pdSQP was unable to solve the 8 cases mss1, lukvle6, lukvle8, lukvle13, lukvle14, lukvle16, orthrds2, and orthrega and algorithm SNOPT has 4 problems unsolved these are lukvle9, lukvle10, lch, and orthrgds. But Algorithm 2.1 was more robust than pdSQP or SNOPT with only 2 problems that are unsolved and for the other problems the solution point was reached successfully. Finally for some problems Algorithm 2.1 requires fewer number of function evaluations than pdSQP or SNOPT. Therefore we can say that the new algorithm is superior to both pdSQP and SNOPT in a certain extent.
An augmented Lagrangian algorithm is proposed for solving nonlinear equality constrained optimization problems. The Hessian matrix of augmented Lagrangian function in the unconstrained subproblems is updated using new Hessian approximations. The search direction based on these matrices is decent, since the Hessian matrix is positive definite for all values of φ ≥ 1. The formula (12) or (13) represent a class of approximating matrices as a function of a scalar parameter φ that includes the DFP and BFGS methods as special cases. Numerical results of equality constrained problems from CUTEr are presented. The algorithm has been compared with algorithms pdSQP and SNOPT and the numerical tests of Algorithm 2.1 seem to be promising.
