Hessian Matrix and Extrema of Multivariate Functions: Second-order Sufficient Optimality Conditions

Name: aqSeabiscuit    Date: 2020/4/14

Second-order sufficient optimality conditions

Problem: Suppose that f : \mathbb{R}^n \to \mathbb{R} is twice differentiable at x. Show that x is a strict local minimum if \nabla f(x) = 0 and the Hessian matrix H(x) is positive definite.

Solve: First consider the case n = 2, where the function f : \mathbb{R}^2 \to \mathbb{R} can be written as f(x_1, x_2) with x_1, x_2 \in \mathbb{R}. Suppose that f is twice differentiable at \hat{x} = (\hat{x}_1, \hat{x}_2). Then we can expand f near \hat{x} with the help of a Taylor series up to second order [1]:

    f(x) = f(\hat{x}) + \frac{\partial f}{\partial x_1}\Big|_{\hat{x}} \Delta x_1 + \frac{\partial f}{\partial x_2}\Big|_{\hat{x}} \Delta x_2
           + \frac{1}{2} \left( \frac{\partial^2 f}{\partial x_1^2}\Big|_{\hat{x}} \Delta x_1^2
           + 2 \frac{\partial^2 f}{\partial x_1 \partial x_2}\Big|_{\hat{x}} \Delta x_1 \Delta x_2
           + \frac{\partial^2 f}{\partial x_2^2}\Big|_{\hat{x}} \Delta x_2^2 \right)    (1)

where \Delta x_1 = x_1 - \hat{x}_1 and \Delta x_2 = x_2 - \hat{x}_2. Writing the Taylor expansion above in matrix form gives

    f(x) = f(\hat{x}) + \begin{pmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{pmatrix}\Big|_{\hat{x}}
           \begin{pmatrix} \Delta x_1 \\ \Delta x_2 \end{pmatrix}
           + \frac{1}{2} \begin{pmatrix} \Delta x_1 & \Delta x_2 \end{pmatrix}
           \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} \\
                           \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} \end{pmatrix}\Big|_{\hat{x}}
           \begin{pmatrix} \Delta x_1 \\ \Delta x_2 \end{pmatrix}    (2)

Equivalently, its vector form can be given as

    f(x) = f(\hat{x}) + \nabla f(\hat{x}) \, \Delta x + \frac{1}{2} \Delta x^T H(\hat{x}) \, \Delta x    (3)

where

    \Delta x = [\Delta x_1 \ \Delta x_2]^T    (4)

    \nabla f(\hat{x}) = \left[ \frac{\partial f}{\partial x_1} \ \frac{\partial f}{\partial x_2} \right]\Big|_{\hat{x}}    (5)
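As a quick sanity check (not part of the original note), the vector form (3) can be verified numerically: the quadratic model should match f up to a remainder of order |\Delta x|^3. The function f(x_1, x_2) = e^{x_1} + x_1 x_2 + x_2^2 below is a hypothetical example chosen only for illustration, with its gradient and Hessian worked out by hand.

```python
import numpy as np

# Hypothetical test function (not from the note): f(x1, x2) = exp(x1) + x1*x2 + x2**2
def f(x):
    return np.exp(x[0]) + x[0]*x[1] + x[1]**2

def grad(x):
    # gradient of this particular f, computed by hand
    return np.array([np.exp(x[0]) + x[1], x[0] + 2*x[1]])

def hess(x):
    # Hessian of this particular f, computed by hand
    return np.array([[np.exp(x[0]), 1.0],
                     [1.0,          2.0]])

x_hat = np.array([0.3, -0.2])   # arbitrary expansion point

# The quadratic model (3) should match f up to a third-order remainder,
# so each tenfold shrink of |dx| should cut the error roughly 1000-fold.
for scale in (1e-1, 1e-2, 1e-3):
    dx = scale * np.array([1.0, -2.0])
    model = f(x_hat) + grad(x_hat) @ dx + 0.5 * dx @ hess(x_hat) @ dx
    print(scale, abs(f(x_hat + dx) - model))
```

The cubic decay of the printed error is exactly what a correct second-order expansion predicts; a first-order (linear) model would only show quadratic decay.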
    H(\hat{x}) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} \\
                                 \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} \end{pmatrix}\Big|_{\hat{x}}    (6)

H(\hat{x}) is the Hessian matrix of second-order partial derivatives of the function f at \hat{x}. Similarly, we can generalize to arbitrary n. For f : \mathbb{R}^n \to \mathbb{R}, the second-order Taylor expansion at \hat{x} in vector form is given by the same formula (3) above, where

    \Delta x = [\Delta x_1 \ \Delta x_2 \ \cdots \ \Delta x_n]^T    (7)

    \nabla f(\hat{x}) = \left[ \frac{\partial f}{\partial x_1} \ \frac{\partial f}{\partial x_2} \ \cdots \ \frac{\partial f}{\partial x_n} \right]\Big|_{\hat{x}}    (8)

    H(\hat{x}) = \begin{pmatrix}
        \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
        \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
        \vdots & \vdots & \ddots & \vdots \\
        \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
    \end{pmatrix}\Big|_{\hat{x}}    (9)

Since f is twice differentiable at \hat{x}, the order of differentiation does not matter when computing the second-order partial derivatives, that is,

    \frac{\partial}{\partial x_i} \frac{\partial f}{\partial x_j} = \frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i}    (10)

For the Hessian matrix we therefore have H_{ij} = H_{ji}, so H(\hat{x}) is a symmetric matrix.

Let q(X) = X^T A X, where A is a symmetric matrix and X = (x_1, x_2, \cdots, x_n)^T. (The quadratic form is written q here to avoid reusing the name f.) By the definition of a positive definite quadratic form, if the matrix A is positive definite, then for all X \in \mathbb{R}^n with X \neq 0 we have [2]

    q(X) = X^T A X > 0    (11)

Now let g(x) = \Delta x^T H(\hat{x}) \Delta x, where \Delta x and H(\hat{x}) are as in (7) and (9). Suppose that H(\hat{x}) is positive definite; since it is also symmetric, we can conclude from (11) that for all x \in \mathbb{R}^n with x \neq \hat{x} (so that \Delta x \neq 0),

    g(x) = \Delta x^T H(\hat{x}) \Delta x > 0    (12)
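A minimal numerical sketch of this step, reusing the hypothetical f and x̂ from the previous snippet: a symmetric matrix is positive definite exactly when all of its eigenvalues are positive, so the eigenvalue check below is one concrete way to test the hypothesis on H(x̂), and the random samples illustrate the quadratic-form inequality (12).

```python
import numpy as np

# Hessian of the hypothetical f(x1, x2) = exp(x1) + x1*x2 + x2**2
# from the previous snippet, evaluated at the same point x_hat
x_hat = np.array([0.3, -0.2])
H = np.array([[np.exp(x_hat[0]), 1.0],
              [1.0,              2.0]])

# A symmetric matrix is positive definite iff all eigenvalues are positive.
# eigvalsh is NumPy's eigenvalue routine for symmetric matrices; using it
# is justified here because (10) makes the Hessian symmetric.
eigvals = np.linalg.eigvalsh(H)
print(eigvals, bool(np.all(eigvals > 0)))

# Illustrate (12): the quadratic form is positive for random nonzero dx
rng = np.random.default_rng(0)
for _ in range(1000):
    dx = rng.normal(size=2)
    assert dx @ H @ dx > 0
```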
According to (3), we can deduce the formula below:

    f(\hat{x} + \eta v) = f(\hat{x}) + \eta \nabla f(\hat{x}) \, v + \frac{\eta^2}{2} v^T H(\hat{x}) \, v    (13)

where v \in \mathbb{R}^n, |v| \neq 0, and \eta is a constant called the learning rate [3]. Strictly speaking, (13) holds up to a remainder that is o(\eta^2) as \eta \to 0; since v^T H(\hat{x}) v > 0 by (12), the quadratic term dominates this remainder for all sufficiently small \eta \neq 0. Suppose that

    \nabla f(\hat{x}) = 0    (14)

From (12), (13), and (14), we get, for all sufficiently small \eta \neq 0,

    f(\hat{x} + \eta v) = f(\hat{x}) + \frac{\eta^2}{2} v^T H(\hat{x}) \, v > f(\hat{x}), \quad \forall v \in \mathbb{R}^n, |v| \neq 0    (15)

Hence \hat{x} is a strict local minimizer. So we can draw the conclusion: if f : \mathbb{R}^n \to \mathbb{R} is twice differentiable at x, then sufficient conditions for x to be a strict local minimizer are that the Hessian matrix H(x) is positive definite and \nabla f(x) = 0. A small numerical check of this conclusion is sketched after the references.

References

[1] T. U. Department of Mathematics, Advanced Mathematics (7th Edition).
[2] T. U. Department of Mathematics, Linear Algebra (6th Edition).
[3] W. Jianxin, Pattern Recognition.
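As promised above, here is a minimal numerical sketch of the conclusion. It assumes a hypothetical test function f(x_1, x_2) = \cos x_1 + x_2^2 (not from the note), which has a stationary point with positive definite Hessian at x̂ = (π, 0); both the gradient and the Hessian are worked out by hand for this f.

```python
import numpy as np

# Hypothetical test function with a stationary point at x_hat = (pi, 0):
# f(x1, x2) = cos(x1) + x2**2
def f(x):
    return np.cos(x[0]) + x[1]**2

x_hat = np.array([np.pi, 0.0])

# Gradient and Hessian at x_hat, computed by hand for this particular f
grad = np.array([-np.sin(x_hat[0]), 2*x_hat[1]])   # = (0, 0): condition (14)
H = np.array([[-np.cos(x_hat[0]), 0.0],            # = [[1, 0], [0, 2]]
              [0.0,               2.0]])

print(np.allclose(grad, 0.0))                      # gradient vanishes
print(bool(np.all(np.linalg.eigvalsh(H) > 0)))     # Hessian positive definite

# Check (15) by sampling: f(x_hat + eta*v) > f(x_hat) for small eta, any v != 0
rng = np.random.default_rng(1)
eta = 1e-3
for _ in range(1000):
    v = rng.normal(size=2)
    v /= np.linalg.norm(v)                         # unit direction
    assert f(x_hat + eta*v) > f(x_hat)
print("f increases in every sampled direction, as (15) predicts")
```

The sampling is only an illustration, of course: it checks finitely many directions, whereas the proof above covers every direction at once.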