The conjugate gradient (CG) techniques are a class of unconstrained optimization algorithms with strong local and global convergence qualities and minimal memory needs. While the quasi-Newton methods are reliable and efficient on a wide range of problems and these methods are converge faster than the conjugate gradient methods and require fewer function evaluations, however, they are request substantially more storage, and if the problem is ill-conditioned, they may require several iterations. There is another class, termed preconditioned conjugate gradient method, it is a technique that combines two methods conjugate gradient with quasi-Newton. In this work, we proposed a new two limited memory preconditioned conjugate gradient methods (New1 and New2), to solve nonlinear unconstrained minimization problems, by using new modified symmetric rank one (NMSR1) and new modified Davidon, Fletcher, Powell (NMDFP), and also using projected vectors. We proved that these modifications fulfill some conditions. Also, the descent condition of the new technique has been proved. The numerical results showed the efficiency of the proposed new algorithms compared with some standard nonlinear, unconstrained problems.
Consider the following nonlinear unconstrained minimization problem
where f : Rn → R is a twice continuously differentiable real-valued function. A nonlinear conjugate gradient algorithms to solve (1) have the form
which is iterative technique and it is starting from an initial guess x1 ∈ Rn and the positive step length αm is calculated to satisfy the strong Wolfe conditions and dm is the search direction. For the preconditioned conjugate gradient techniques the search directions dm are generated as:
where gm+1 = ∇ f (xm+1) and Hm+1 is the inverse Hessian approximation and it is chosen to satisfy the following quasi-Newton equation
The quasi-Newton method is avoids the requirement to compute Hessian matrices by repeating from iteration to iteration.
We obtain the corresponding update formula for the inverse Hessian approximation [1]
where ym = gm+1 - gm and vm = xm+1 - xm .
Yang, Xu and Gao [2] made a little modification for self scaling symmetric rank one update wth Davidon’s optimal condition [3] (SHSR1) as follows
where ρm is the scaling factor, , ϑm = 6 (fm - fm+1) + 3 (gm + gm+1) Tvm.
There are other methods to find the inverse Hessian approximation, like Davidon, Fletcher and Powell (DFP), which has the form [1]:
and Broyden, Fletcher, Gold Farb and Shanno Method (BFGS), which has the form [1]:
The parameter βm in (3) is a conjugacy parameter. The most well-know formulas of βm are called Hestenes-Stiefel (H/S) [4], Polak-Ribiere-Polyak (PRP) [5], Dai –Yuan (D/Y) [6] and Fletcher-Reeves(F/R) [7] which have the following forms:
where ym = gm+1 - gm and the symbol ∥ .∥ means the Euclidean norm.
The parameter βm can be chosen in a variety of ways. For preconditioned conjugate gradient algorithm, we can choose from the choices below:
Limited-memory quasi-Newton methods have previously been studied by various authors, for example, as memory-less quasi-Newton methods by Shanno [8], limited-memory BFGS (L-BFGS) by Nocedal [9], and also many of studies focused on limited memory quasi-Newton methods are concentrate on the limited memory BFGS (L-BFGS) method [10–13].
The methods are suitable for large scale problems because the amount of storage required by the algorithms can be controlled by the user. The aim of this paper is to propose two new limited-memory preconditioned conjugate gradient algorithms for solving nonlinear and unconstrained optimization problems because this class of methods is a good efficiency task to solve such problems.
Projected Quasi-Newton methods
The matrix of the symmetric rank one (SR1),
and the matrix of (DFP),
have the property that Hm+1yj = vj, where m - M ⩽ m - 1 and it is symmetric, where m is the number of iteration and M is a memory index number.
At each iteration we can transform the two sequences of vectors {vj } m-M⩽j⩽m-1 and {yj } m-M⩽j⩽m-1 in to and , so that for m - M ⩽ i ≠ j ⩽ m - 1,, in order to satisfy the quasi-Newton condition Hmym-1 = vm-1, we let and .
This transformation is easily realized by the following Gram-Schmidt process.
The paper is organized as follows: in section 3, we will suggest a new modified symmetric rank one (NMSR1) and a new modified Davidon, Fletcher, and Powell (NMDFP). Section 4 contains the outlines of the new limited memory preconditioned conjugate gradient methods (New1 and New2), these algorithms combine the methods NMSR1 and NMDFP with conjugate gradient methods, also using projection and limited memory, to solve nonlinear unconstrained minimization problems. Then, in section 5, we will study the fulfillment of the quasi-Newton condition and the positive definite condition for NMSR1, as well as the positive definite condition for NMDFP and the descent condition of this new technique. While section 6 includes some numerical results to show the efficiency of these new algorithms (New1 and New2).
Derivative of new modified SR1 and new modified DFP methods
In this section, we will suggest a new modified symmetric one (NMSR1) and a new modified Davidon, Fletcher, and Powell (NMDFP) for unconstrained minimization problems.
First we propose the following equation as a new quasi-Newton equation
and we suppose that
Where ym = gm+1 - gm, vm = xm+1 - xm, gm = ∇ f (xm), gm+1 = ∇ f (xm+1) and δ1 > 0.
Now, Let
where αm ∈ R and wm ∈ Rn give Hm, ym and so that the required relationship is satisfied.
To find αm and wm, multiply both sides of Equation(21) by ym and by using (19), we have
where the constants b1, b2 and the vectors r1 and r2 are to be determined.
Here,
by forcing (31) to satisfy the new quasi-Newton condition (19) multiply both sides of Equation (31) by ym we get,
and by Equation (19) we have,
where and can be identified as scalar. Although the vectors r1 and r2 in (32) are not unique, so we chose , r2 = Hmym, which make (32) to be satisfied, when and .
Here, the rank two formula can be expressed as the NMDFP formula
Algorithms of the new limited memory preconditioned conjugate gradient methods (New1 and New2)
Let M ⩾ 0 be a memory index number, and assume that a positive definite symmetric matrix H0 is given. During the first M iterative steps, we set and ym+1 = [y0, …, ym].
But for m ⩾ M, we only use the latest M steps information, which are and ym+1 = [ym-M+1, …, ym]
Algorithm of the New 1
Set m = 0, chose x0 ∈ Rn, M, a real symmetric positive definite H0 = I, and ɛ is a termination scalar and Compute g0.
Now, to complete the proof, we need to prove that
consider,
where δ1 > 0 and , under the condition, if Sm < 0, then take, and we know that is positive, then, .
By Cauchy Schwarz the first term in the right side of the Equation (47) is nonnegative and the second term is nonnegative also. To complete the proof, we need to show that at lest one of the terms is not equal to zero. Only when if a1 and a2 are proportional does the first term disappear, that is if a1 = β1a2 for some scalar β1 . Therefore, to complete the proof, it is only to prove that if a1 = β1a2, then,
Observe that,
hence, X = β1ym
here, we obtain
and in the theorem 2 we proved that .
Thus, for all for X ≠ 0
■
Theorem 4.Suppose that the sequence {xm} is generated by (2), then the search direction in (3) with conjugate gradient (14) and with (29) or (33) satisfy the descent condition, i.e. with exact and inexact line search.
where they were proved in the Theorems 2 and 3 that they are positively defined.
Multiplying both sides of Equation (48) by gm+1, to obtain
Here, we will take two cases, the first case is exact line search, that is mean , then, we get
The second case is inexact line search, that is mean , here, we will prove by mathematical induction, , where d0 = - Hg0, and we assume that it is true for case m that means . To prove case m + 1, by the inequality
In this section, we focus on testing the implementation of the new algorithms. We compare New1, which is the new limited memory preconditioned conjugate gradient method (LMPCG), where it contains (NMSR1/DY) with the standard LMPCG (SR1/DY), and New2, which is the new LMPCG, where it contains (NMDFP/HS) with the standard LMPCG (DFP/HS). The comparative tests contain nonlinear unconstrained problems with different dimensions n = 50, 100 and 500. Different numbers were used M = 3, 4, 6, 7 and 9 as a memory index. We considered ɛ = 10-5 and stopping criteria is set to ∥gm+1 ∥ < 10-5. Tables (1) and (2) of the findings provide specific references to the number of iterations NOI and the number of functions NOF. The results are in the two tables below confirm that the algorithms New1 and New2 are superior to standard L-PCG methods with respect to NOI and NOF. We adopt the performance profiles by Dolan and More [16] to compare the performance among different methods. Figures 1 and 2 shows the performance profiles of the New1 based on NOI and NOF respectively, and we see that the New1 is more performant than the standard LMPCG(SR1/DY). While the Figs. 3 and 4 shows the performance profiles of the New2 based on NOI and NOF respectively, and we see that the New2 is more performant than the standard LMPCG(DFP/HS).
Comparing the Performance of the Two Algorithms (Standard LMPCG (SR1/DY)) and New1 (New LMPCG (NMSR1/DY))
Test
Dimensions
Standard LMPCG
New1 (New LMPCG
Functions
(n)
(SR1/DY)
(NMSR1/DY))
M
NOI
NOF
NOI
NOF
G-Central
50
3
111
708
83
579
4
111
708
83
579
6
111
708
83
579
7
111
708
83
579
9
111
708
83
579
100
3
115
754
91
669
4
115
754
91
669
6
115
754
91
669
7
115
754
91
669
9
115
754
91
669
500
3
134
944
105
839
4
134
944
105
839
6
134
944
105
839
7
134
944
105
839
9
134
944
105
839
Cubic
50
3
47
122
38
104
4
47
122
38
104
6
47
122
38
104
7
47
122
38
104
9
47
122
38
104
100
3
65
165
44
114
4
65
165
44
114
6
65
165
44
114
7
65
165
44
114
9
65
165
44
114
500
3
53
138
32
88
4
53
138
32
88
6
53
138
32
88
7
53
138
32
88
9
53
138
32
88
Beal
50
3
26
59
16
40
4
26
59
16
40
6
26
59
16
40
7
26
59
16
40
9
26
59
16
40
100
3
26
59
17
47
4
26
59
17
47
6
26
59
17
47
7
26
59
17
47
9
26
59
17
47
500
3
26
59
16
44
4
26
59
16
44
6
26
59
16
44
7
26
59
16
44
9
26
59
16
44
Fred
50
3
353
926
211
526
4
353
926
211
526
6
353
926
211
526
7
353
926
211
526
9
353
926
211
526
100
3
347
892
265
729
4
347
892
265
729
6
347
892
265
729
7
347
892
265
729
9
347
892
265
729
500
3
311
813
31
78
4
311
813
31
78
6
311
813
31
78
7
311
813
31
78
9
311
813
31
78
Rosen
50
3
41
109
38
101
4
41
109
38
101
6
41
109
38
101
7
41
109
38
101
9
41
109
38
101
100
3
41
108
38
102
4
41
108
38
102
6
41
108
38
102
7
41
108
38
102
9
41
108
38
102
500
3
44
115
41
107
4
44
115
41
107
6
44
115
41
107
7
44
115
41
107
9
44
115
41
107
OSP
50
3
42
165
40
135
4
42
165
40
135
6
42
165
40
135
7
42
165
40
135
9
42
165
40
135
100
3
44
2832
86
307
4
44
2832
86
307
6
44
2832
86
307
7
44
2832
86
307
9
44
2832
86
307
500
3
453
1499
402
1372
4
453
1499
402
1372
6
453
1499
402
1372
7
453
1499
402
1372
9
453
1499
402
1372
RECIP
50
3
15
45
15
38
4
15
45
15
38
6
15
45
15
38
7
15
45
15
38
9
15
45
15
38
100
3
15
45
15
39
4
15
45
15
39
6
15
45
15
39
7
15
45
15
39
9
15
45
15
39
500
3
15
45
15
41
4
15
45
15
41
6
15
45
15
41
7
15
45
15
41
9
15
45
15
41
TRI
50
3
355
711
322
645
4
355
711
322
645
6
355
711
322
645
7
355
711
322
645
9
355
711
322
645
100
3
746
1493
671
1343
4
746
1493
671
1343
6
746
1493
671
1343
7
746
1493
671
1343
9
746
1493
671
1343
500
3
3319
6639
3250
6501
4
3319
6639
3250
6501
6
3319
6639
3250
6501
7
3319
6639
3250
6501
9
3319
6639
3250
6501
SUM
50
3
11
63
10
52
4
11
63
10
52
6
11
63
10
52
7
11
63
10
52
9
11
63
10
52
100
3
14
86
11
67
4
14
86
11
67
6
14
86
11
67
7
14
86
11
67
9
14
86
11
67
500
3
24
128
15
80
4
24
128
15
80
6
24
128
15
80
7
24
128
15
80
9
24
128
15
80
Wolfe
50
3
100
201
92
185
4
100
201
92
185
6
100
201
92
185
7
100
201
92
185
9
100
201
92
185
100
3
111
223
95
191
4
111
223
95
191
6
111
223
95
191
7
111
223
95
191
9
111
223
95
191
500
3
107
216
103
208
4
107
216
103
208
6
107
216
103
208
7
107
216
103
208
9
107
216
103
208
Comparing the performance of the two algorithms (Standard LMPCG (DFP/HS) and New2 (New LMPCG (NMDFP/HS))
Test
Dimensions
Standard LMPCG
New2 (New LMPCG
Functions
(n)
(DFP/HS)
(NMDFP/HS))
M
NOI
NOF
NOI
NOF
G-Central
50
3
51
391
52
356
4
33
220
16
94
6
44
301
14
87
7
37
231
66
439
9
63
364
74
358
100
3
51
391
12
71
4
33
220
30
208
6
48
347
20
105
7
39
261
36
234
9
67
412
36
234
500
3
51
391
21
140
4
33
220
22
154
6
57
463
22
147
7
40
275
22
147
9
85
564
22
147
Miele
50
3
30
90
24
73
4
36
124
22
71
6
50
167
29
104
7
46
138
29
104
9
26
112
29
104
100
3
39
121
31
99
4
29
89
21
68
6
56
205
27
98
7
36
112
32
92
9
33
160
48
163
500
3
37
116
30
96
4
52
197
39
134
6
61
237
50
209
7
35
106
55
211
9
42
217
51
199
Powell
50
3
34
107
43
115
4
54
158
44
112
6
67
199
57
153
7
64
181
38
95
9
65
191
55
154
100
3
34
110
34
87
4
60
169
37
96
6
74
199
43
139
7
75
207
41
104
9
100
266
50
139
500
3
44
121
39
111
4
78
217
42
105
6
59
164
51
162
7
76
205
46
121
9
95
265
51
130
Wood
50
3
237
710
84
207
4
719
1937
60
142
6
198
596
158
483
7
179
532
163
509
9
153
480
168
525
100
3
376
1134
75
180
4
859
2333
521
1455
6
298
873
472
1260
7
276
814
284
866
9
269
844
207
602
500
3
934
2768
31
69
4
1566
4427
30
68
6
834
2718
34
79
7
880
2622
36
83
9
859
2610
38
88
Cubic
50
3
43
112
40
100
4
43
106
34
91
6
38
105
22
68
7
64
178
21
62
9
57
152
28
92
100
3
49
123
41
97
4
51
124
34
84
6
39
105
25
69
7
72
175
21
61
9
56
151
28
90
500
3
46
117
39
93
4
52
136
41
98
6
54
139
26
72
7
335
828
28
73
9
321
810
24
66
Rosen
50
3
144
446
88
232
4
151
469
30
80
6
153
472
29
78
7
155
475
35
94
9
505
1464
42
113
100
3
242
702
115
274
4
232
676
29
74
6
250
735
30
84
7
226
639
34
89
9
1590
4308
37
99
500
3
548
1426
32
86
4
550
1445
32
89
6
534
1365
34
90
7
539
1415
33
89
9
1315
4185
36
94
Shallow
50
3
9
27
9
24
4
9
27
9
24
6
9
27
9
24
7
9
27
9
24
9
9
27
9
24
100
3
10
29
9
24
4
10
29
9
24
6
10
29
9
24
7
10
29
9
24
9
10
29
9
24
500
3
10
29
10
26
4
10
29
10
26
6
10
29
10
26
7
10
29
10
26
9
10
29
10
26
Non-diagonal
50
3
50
119
46
107
4
51
118
45
138
6
46
109
41
105
7
45
105
25
81
9
38
93
21
73
100
3
69
164
44
110
4
71
171
48
124
6
69
166
44
109
7
73
173
39
102
9
59
143
44
112
500
3
79
189
48
115
4
79
190
33
90
6
79
189
39
101
7
72
178
40
101
9
72
174
37
106
TRI
50
3
44
89
44
89
4
44
89
44
89
6
44
89
44
89
7
44
89
44
89
9
44
89
44
89
100
3
71
143
71
143
4
71
143
71
143
6
71
143
71
143
7
71
143
71
143
9
71
143
71
143
500
3
188
377
188
377
4
188
377
188
377
6
188
377
188
377
7
188
377
188
377
9
188
377
188
377
Wolfe
50
3
53
107
51
103
4
54
109
51
103
6
57
115
56
113
7
58
117
57
115
9
60
121
58
117
100
3
77
155
62
125
4
83
167
69
139
6
92
185
81
163
7
94
189
84
169
9
96
193
90
181
500
3
88
177
73
147
4
95
191
90
184
6
106
213
87
178
7
112
225
90
184
9
124
249
98
200
Performance based on the number of iterations.
Performance based on the number of functions.
Performance based on the number of iterations.
Performance based on the number of functions.
Conclusion
We have presented two new limited-memory preconditioned conjugate gradient algorithms to find the minimum of the nonlinear optimization problems. New1 which is the new LMPCG, by using (NMSR1/DY), and New2 which is the new LMPCG, by using (NM DFP/HS). Projected vectors and also are used. The quasi-Newton condition and the positive definite condition of the NMSR1 method have been proved, also the positive definite of the NMDFP method has been proved. Furthermore, we have also proved that the new techniques are satisfied the descent condition. The numerical tests were conducted on problems with low and high dimensionality, with comparisons made between different test functions. Also, different numbers were used as a memory index. Through the numerical results in the Tables 1 and 2, and also through the Figs. 1, 2, 3 and 4, we conclude that the new algorithms are more efficient than standard methods.
References
1.
EdwinK.P. and Stanislaw żakH., An Introduction to Optimization/Second Edition., Printed in the United States of America (2001).
2.
YangY.T., X.C.X. and G.Y.L., G.Y.L., An optimal self-scaling strategy to the modified symmetric rank one updating, J Xian Jiaotong Univ34 (2005), 100–103.
3.
SunL.P., Updating of the self-scaling symmetric rank one algorithm with limited memory for large-scale unconstrained optimization, Comput Optim Appl27 (2004), 23–29.
4.
HestenesM.R. and StiefelnE., Method of conjugate gradient for solving linear equations, J Res Nat Bur Stand49 (1952), 409–436.
5.
PolakE. and RibièreG., Note sur la convergence de directions conjugees, Rev Fr Inform Rech Oper3 (1969), 35–43.
6.
DaiY.H. and YuanY., A nonlinear conjugate gradient with a strong global convergence properties, SIAM J Optim10 (1999), 177–182.
7.
FletcherR. and ReevesC.M., Function minimization by conjugate gradients, Comput J7 (1964), 149–154.
8.
ShannoD.F., Conjugate gradient methods with inexact searches, Math Oper Res3 (1978), 244–256.
9.
NocedalJ., Updating quasi-Newton matrices with limited storage, Math Comp35 (1980), 773–782.
10.
Dong LiuC. and Jorge Nocedal, On the limited memory BFGS method for large scale optimization, Math Program45 (1989), 503–528.
11.
MoralesJ.L., A Numerical Study of Limited Memory BFGS methods, Appl Math Lett15 (2002), 481–487.
12.
Al-BaaliM., Extra updates for the BFGS method, Optim Methods Softw13 (2000), 159–179.
13.
YangY.T. and XuC.X., A compact limited memory method for large scale unconstrained optimization, Eur J Oper Res180 (2007), 48–56.
14.
StoreyY.F.H., and C., Projected Quasi-Newton Updates, Math Rep A144, Dep Math Sci Loughbrgh, (1991).
15.
Zhen-Jun Shi and Jie Shen, Step-size Estimation for Unconstrained Optimization Methods, Comput Appllied Math24 (2005), 399–416.
16.
DolanE.D., and M., Benchmarking optimization software with performance profiles, Math Program91 (2002), 201–213.