Trust region methods: global/local convergence, approximate methods
January 24, 2014
Trust-region idea

Model m_k ≈ f(·):
  f(x_k + p) ≈ m_k(p) = f(x_k) + g_k^T p + (1/2) p^T B_k p,
where g_k = ∇f(x_k) and B_k ∈ { I, ∇²f(x_k), ≈∇²f(x_k) }.

Trust-region subproblem: minimize m_k(p) subject to ‖p‖ ≤ Δ_k.
Trust-region algorithm:

1: while ‖g_k‖ ≥ ε do
2:   p_k := approximate solution of the TR subproblem
3:   Reduction ratio: ρ_k := (f(x_k) − f(x_k + p_k)) / (m_k(0) − m_k(p_k))
4:   if ρ_k < 1/4 then
5:     Δ_{k+1} = Δ_k / 4
6:   else if ρ_k > 3/4 and ‖p_k‖ = Δ_k then
7:     Δ_{k+1} = 2 Δ_k
8:   else
9:     Δ_{k+1} = Δ_k
10:  end if
11:  if ρ_k > η then
12:    Accept step: x_{k+1} = x_k + p_k
13:  else
14:    Reject step: x_{k+1} = x_k
15:  end if
16: end while
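The loop above can be transcribed into Python as follows (a minimal sketch: `f`, `grad`, `hess` are user-supplied callables, and `solve_subproblem` is any approximate TR-subproblem solver, e.g. the Cauchy point or a dogleg step; `delta_max` and `max_iter` are safeguards not on the slide):

```python
import numpy as np

def trust_region(f, grad, hess, solve_subproblem, x0,
                 delta0=1.0, delta_max=100.0, eta=0.15, eps=1e-8, max_iter=500):
    """Basic trust-region loop following the update rules above.

    `solve_subproblem(g, B, delta)` must return an approximate minimizer of
    m_k(p) = f(x_k) + g^T p + 0.5 p^T B p subject to ||p|| <= delta.
    """
    x, delta = np.asarray(x0, dtype=float), delta0
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < eps:
            break
        B = hess(x)
        p = solve_subproblem(g, B, delta)
        # rho_k = actual reduction / predicted reduction
        predicted = -(g @ p + 0.5 * p @ B @ p)   # m_k(0) - m_k(p_k)
        rho = (f(x) - f(x + p)) / predicted
        if rho < 0.25:
            delta /= 4.0
        elif rho > 0.75 and np.isclose(np.linalg.norm(p), delta):
            delta = min(2.0 * delta, delta_max)  # expand only on boundary steps
        if rho > eta:
            x = x + p                            # accept the step
    return x
```

Note that the radius is expanded only when the step hit the boundary (‖p_k‖ = Δ_k): an interior step that performed well gives no reason to enlarge the region.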
Local convergence of TR-Newton algorithm

Theorem 4.9: superlinear convergence of TR-Newton

Assume:
- f ∈ C², ∇²f Lipschitz;
- x_k → x*, where x* satisfies the 2nd-order sufficient conditions;
- B_k = ∇²f(x_k), Newton step p_k^N = −∇²f(x_k)^{−1} ∇f(x_k);
- if ‖p_k^N‖ ≤ Δ_k/2, then p_k satisfies p_k − p_k^N = o(‖p_k^N‖).

Then the trust-region bound is inactive for all large k and
  lim_{k→∞} ‖x_{k+1} − x*‖ / ‖x_k − x*‖ = 0.
When can we expect convergence?

When the (approximate) solution of the subproblem beats the steepest-descent step within the trust region (the Cauchy point p^C).
Reminder: Cauchy point calculation

The Cauchy point is the minimizer of m_k(−α g_k) over 0 ≤ α ≤ Δ_k/‖g_k‖.

1: if g_k^T B_k g_k ≤ 0 then
2:   p_k^C = −(Δ_k/‖g_k‖) g_k
3: else
4:   α = ‖g_k‖² / (g_k^T B_k g_k)
5:   if α ‖g_k‖ ≤ Δ_k then
6:     p_k^C = −α g_k
7:   else
8:     p_k^C = −(Δ_k/‖g_k‖) g_k
9:   end if
10: end if
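A direct transcription of this case analysis (a sketch assuming dense NumPy arrays; `g`, `B`, `delta` stand for g_k, B_k, Δ_k):

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Cauchy point: minimizer of m(-alpha * g) over 0 <= alpha <= delta/||g||."""
    g_norm = np.linalg.norm(g)
    gBg = g @ B @ g
    if gBg <= 0:
        # Non-positive curvature along -g: the model decreases all the way,
        # so step to the trust-region boundary.
        return -(delta / g_norm) * g
    alpha = g_norm**2 / gBg          # unconstrained 1-D minimizer
    if alpha * g_norm <= delta:
        return -alpha * g            # interior minimizer
    return -(delta / g_norm) * g     # otherwise clip to the boundary
```

The two `else`-branches of the slide collapse to the same boundary point, which is why the function returns the same expression in both curvature cases.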
Global convergence of TR-algorithms

Theorem 4.5, 4.6

Assume:
- f ∈ C², ∇²f Lipschitz, f bounded from below;
- ‖B_k‖ ≤ β, ‖p_k‖ ≤ Δ_k;
- m_k(p_k) ≤ m_k(p_k^C) [can be relaxed, significantly!].

Then:
1. If any reduction is accepted by the TR algorithm (any ρ_k > 0), then lim inf_{k→∞} ‖g_k‖ = 0;
2. If only sufficient reductions are accepted (ρ_k > η, η ∈ (0, 1/4)), then lim_{k→∞} ‖g_k‖ = 0.
Cauchy-point based TR algorithms:
- Steepest descent
- Dogleg
- 2D-minimization
Steepest descent: p_k = p_k^C

Good:
- Cheap (explicit formula for p_k)
- Provably convergent
- Works even when B_k is not positive definite!

Bad:
- SLOW!
Dogleg:

Minimize m_k over the straight line segments
  [0, −(‖g_k‖² / (g_k^T B_k g_k)) g_k] ∪ [−(‖g_k‖² / (g_k^T B_k g_k)) g_k, −B_k^{−1} g_k].

Good:
- Relatively cheap (one linear system solve per TR subproblem)
- Provably convergent (improves upon the Cauchy point)
- Superlinearly locally convergent (Newton steps are taken whenever feasible)

Bad:
- Does not work when B_k is not positive definite; e.g., can revert to Cauchy points in this case.
Dogleg:

Path:
  p^U = −(g^T g / (g^T B g)) g,   p^B = −B^{−1} g,
  p(τ) = τ p^U                        for 0 ≤ τ ≤ 1,
  p(τ) = p^U + (τ − 1)(p^B − p^U)     for 1 ≤ τ ≤ 2.

Lemma 4.2: Suppose B ≻ 0. Then
1. τ ↦ ‖p(τ)‖ is monotonically non-decreasing (increasing when p^U ≠ 0 and p^B ≠ p^U);
2. τ ↦ m(p(τ)) is monotonically non-increasing.
Dogleg:

1: if B_k is not positive definite then
2:   p_k = p_k^C
3: else
4:   if ‖p_k^U‖ ≥ Δ_k then
5:     p_k = p_k^C
6:   else
7:     p_k^B = −B_k^{−1} g_k
8:     if ‖p_k^B‖ ≤ Δ_k then
9:       p_k = p_k^B
10:    else
11:      Solve the quadratic equation ‖p(τ)‖² − Δ_k² = 0 for τ ∈ [1, 2]; p_k = p(τ)
12:    end if
13:  end if
14: end if
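In Python, the positive-definite branch looks like this (a sketch assuming B_k ≻ 0; the fallback to the Cauchy point for indefinite B_k from lines 1–2 is omitted):

```python
import numpy as np

def dogleg(g, B, delta):
    """Dogleg step for the TR subproblem, assuming B is positive definite."""
    p_u = -(g @ g) / (g @ B @ g) * g             # unconstrained steepest-descent minimizer
    if np.linalg.norm(p_u) >= delta:
        return -(delta / np.linalg.norm(g)) * g  # Cauchy point on the boundary
    p_b = -np.linalg.solve(B, g)                 # full Newton step
    if np.linalg.norm(p_b) <= delta:
        return p_b                               # Newton step fits: take it
    # Second leg: find t in [0, 1] with ||p_u + t (p_b - p_u)|| = delta
    # (t = tau - 1 in the notation above); take the positive root.
    d = p_b - p_u
    a, b, c = d @ d, 2.0 * (p_u @ d), p_u @ p_u - delta**2
    t = (-b + np.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)
    return p_u + t * d
```

By Lemma 4.2, ‖p(τ)‖ is non-decreasing along the path, so the quadratic has exactly one root with t ∈ [0, 1] once ‖p^U‖ < Δ < ‖p^B‖, and it is the positive root taken here.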
Generalization of dogleg: 2D minimization

Suppose B ≻ 0. Search for
  p(η_1, η_2) = η_1 p^U + η_2 p^B,   ‖p‖ ≤ Δ,
minimizing m(η_1, η_2) = m(p(η_1, η_2)).

The 2D subspace includes p^C and the Newton step p^B, hence:
- Provably globally convergent
- Superlinearly locally convergent
2D minimization: solving the subproblem

Suppose B ≻ 0.
- If the minimum is attained in the interior of the trust region, then ∇_η m(η_1, η_2) = 0.
- If not, the minimum is attained on the boundary: a 1D minimization over the parametrized ellipse
  η_1² ‖p^U‖² + η_2² ‖p^B‖² + 2 η_1 η_2 (p^U)^T p^B = Δ².
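The boundary case can be sketched as follows (assumptions: B ≻ 0 and p^U, p^B linearly independent so the Gram matrix is invertible; a coarse grid search over the ellipse stands in for a proper 1D minimizer):

```python
import numpy as np

def boundary_2d_min(g, B, delta, n_grid=720):
    """Minimize m(p) = g^T p + 0.5 p^T B p over the trust-region boundary,
    restricted to span(p_U, p_B), via the ellipse parametrization above."""
    p_u = -(g @ g) / (g @ B @ g) * g
    p_b = -np.linalg.solve(B, g)
    S = np.column_stack([p_u, p_b])
    M = S.T @ S                      # Gram matrix: eta^T M eta = ||p||^2
    w, V = np.linalg.eigh(M)
    M_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    best_p, best_m = None, np.inf
    for t in np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False):
        # eta(t) traces the boundary ellipse eta^T M eta = delta^2
        eta = delta * (M_inv_sqrt @ np.array([np.cos(t), np.sin(t)]))
        p = S @ eta
        m = g @ p + 0.5 * p @ B @ p
        if m < best_m:
            best_m, best_p = m, p
    return best_p
```

Since S M^{−1/2} has orthonormal columns, the grid in t maps to uniformly spaced points on the boundary circle ‖p‖ = Δ, so the grid search is a faithful (if slow) discretization of the 1D boundary problem.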
Generalization of dogleg: 2D minimization

Suppose B is not positive definite [Byrd, Schnabel, Schultz].
- Estimate the smallest eigenvalue λ_1(B) and put α ≈ −λ_1 with α > −λ_1, so that B + αI ≻ 0.
- p^B = −(B + αI)^{−1} g.
- If ‖p^B‖ ≥ Δ: solve the 2D minimization problem over span(−g, p^B) precisely as in the case B ≻ 0.
- If ‖p^B‖ < Δ: compute the eigenvector v_1 and put p = p^B + ξ v_1, where ξ is chosen so that ‖p‖ = Δ and ξ v_1^T p^B ≥ 0.
(Read "hard case", pp. 87–88.)
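A sketch of the shifted step: the full eigendecomposition below is an illustrative stand-in for the eigenvalue *estimation* the slide describes, and the boundary branch simply scales p^B instead of doing the full 2D minimization.

```python
import numpy as np

def indefinite_tr_step(g, B, delta, margin=1e-8):
    """Shifted step for indefinite B (sketch of the Byrd-Schnabel-Schultz idea)."""
    lam, V = np.linalg.eigh(B)
    lam1, v1 = lam[0], V[:, 0]          # smallest eigenvalue and its eigenvector
    alpha = -lam1 + margin              # alpha > -lambda_1  =>  B + alpha*I > 0
    p_b = -np.linalg.solve(B + alpha * np.eye(len(g)), g)
    if np.linalg.norm(p_b) >= delta:
        # Here one would solve the 2D minimization over span(-g, p_b);
        # scaling p_b to the boundary is a crude stand-in.
        return (delta / np.linalg.norm(p_b)) * p_b
    # Hard case: extend along v_1 to the boundary with xi * v1^T p_b >= 0,
    # i.e. solve xi^2 + 2*c*xi + (||p_b||^2 - delta^2) = 0, c = v1^T p_b.
    c = v1 @ p_b
    root = np.sqrt(c**2 + delta**2 - p_b @ p_b)
    xi = -c + root if c >= 0 else -c - root
    return p_b + xi * v1
```

Choosing the root with the same sign as c = v_1^T p^B enforces ξ v_1^T p^B ≥ 0, which is what guarantees the extended step does not undo the reduction already achieved by p^B.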