Algorithms for large-scale convex optimization — DTU 2010

3. Proximal gradient method

• introduction
• proximal mapping
• proximal gradient method
• convergence analysis
• accelerated proximal gradient method
• forward-backward method
Proximal mapping

the proximal mapping (or proximal operator) of a convex function h is

\[
\operatorname{prox}_h(x) = \operatorname*{argmin}_u \Big( h(u) + \tfrac{1}{2}\|u - x\|_2^2 \Big)
\]

examples

• h(x) = 0: prox_h(x) = x

• h(x) = I_C(x) (indicator function of C): prox_h is projection on C

\[
\operatorname{prox}_h(x) = P_C(x) = \operatorname*{argmin}_{u \in C} \|u - x\|_2^2
\]

• h(x) = t\|x\|_1: prox_h is the shrinkage (soft-thresholding) operation

\[
\operatorname{prox}_h(x)_i =
\begin{cases}
x_i - t, & x_i \ge t \\
0, & |x_i| \le t \\
x_i + t, & x_i \le -t
\end{cases}
\]
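All three examples reduce to one-line array operations. A minimal NumPy sketch follows; the function names and the box set C = {x : lo ≤ x ≤ hi} chosen for the projection example are illustrative assumptions, not from the slides:

```python
import numpy as np

def prox_zero(x):
    # h(x) = 0: the proximal mapping is the identity
    return x

def prox_box(x, lo, hi):
    # h = indicator of the box C = {x : lo <= x <= hi};
    # prox_h is Euclidean projection onto C, done componentwise
    return np.clip(x, lo, hi)

def prox_l1(x, t):
    # h(x) = t * ||x||_1: componentwise soft thresholding
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
```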
Proximal gradient method

unconstrained problem with cost function split in two components

    minimize   f(x) = g(x) + h(x)

• g convex, differentiable, with dom g = R^n
• h closed, convex, possibly nondifferentiable; prox_h is inexpensive

proximal gradient algorithm

\[
x^{(k)} = \operatorname{prox}_{t_k h}\!\left( x^{(k-1)} - t_k \nabla g(x^{(k-1)}) \right)
\]

t_k > 0 is step size, constant or determined by line search
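A minimal sketch of this iteration with a constant step size; grad_g and prox_h are assumed to be user-supplied callables, and the convention prox_h(v, t) computing prox_{t h}(v) is an assumption for illustration:

```python
import numpy as np

def proximal_gradient(grad_g, prox_h, x0, t, num_iters=100):
    # x^{(k)} = prox_{t h}( x^{(k-1)} - t * grad g(x^{(k-1)}) )
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(num_iters):
        x = prox_h(x - t * grad_g(x), t)
    return x
```

With prox_h = prox_l1 from the sketch above, this specializes to the iterative soft-thresholding method shown later in this section.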
Interpretation

\[
x^{+} = \operatorname{prox}_{th}\!\left( x - t \nabla g(x) \right)
\]

from definition of proximal operator:

\[
\begin{aligned}
x^{+} &= \operatorname*{argmin}_u \Big( h(u) + \tfrac{1}{2t}\,\|u - x + t\nabla g(x)\|^2 \Big) \\
      &= \operatorname*{argmin}_u \Big( h(u) + g(x) + \nabla g(x)^T (u - x) + \tfrac{1}{2t}\,\|u - x\|^2 \Big)
\end{aligned}
\]

x^+ minimizes h(u) plus a simple quadratic local model of g(u) around x
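The second equality follows by expanding the square and discarding terms constant in u; a worked version of that step, using nothing beyond the definitions above:

```latex
\frac{1}{2t}\|u - x + t\nabla g(x)\|^2
  = \frac{1}{2t}\|u - x\|^2 + \nabla g(x)^T (u - x) + \frac{t}{2}\|\nabla g(x)\|^2
% the last term, like g(x), does not depend on u,
% so adding or dropping them leaves the argmin unchanged
```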
Examples

    minimize   g(x) + h(x)

gradient method: h(x) = 0, i.e., minimize g(x)

\[
x^{(k)} = x^{(k-1)} - t_k \nabla g(x^{(k-1)})
\]

gradient projection method: h(x) = I_C(x), i.e., minimize g(x) over C

\[
x^{(k)} = P_C\!\left( x^{(k-1)} - t_k \nabla g(x^{(k-1)}) \right)
\]
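As a concrete sketch of gradient projection, consider minimizing a least-squares objective over a box; the problem data and the box [0, 1]^n are illustrative assumptions, not from the slides:

```python
import numpy as np

# illustrative problem: minimize (1/2)||Ax - b||^2 over C = {x : 0 <= x <= 1}
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

t = 1.0 / np.linalg.norm(A, 2) ** 2      # constant step 1/L with L = ||A||_2^2
x = np.zeros(5)
for _ in range(200):
    grad = A.T @ (A @ x - b)             # gradient of g(x) = (1/2)||Ax - b||^2
    x = np.clip(x - t * grad, 0.0, 1.0)  # projection P_C onto the box
```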
iterative soft-thresholding: h(x) = \|x\|_1, i.e., minimize g(x) + \|x\|_1

\[
x^{(k)} = \operatorname{prox}_{t_k h}\!\left( x^{(k-1)} - t_k \nabla g(x^{(k-1)}) \right)
\]

and

\[
\operatorname{prox}_{th}(u)_i =
\begin{cases}
u_i - t, & u_i \ge t \\
0, & -t \le u_i \le t \\
u_i + t, & u_i \le -t
\end{cases}
\]

(figure: graph of prox_{th}(u)_i versus u_i, with ticks at −t and t)
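A minimal sketch of this iteration for the common instance g(x) = (1/2)\|Ax - b\|^2 with a weighted term λ\|x\|_1; the data and the weight λ are illustrative assumptions:

```python
import numpy as np

def soft_threshold(u, t):
    # prox of t * ||.||_1, applied componentwise
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

# illustrative problem: minimize (1/2)||Ax - b||^2 + lam * ||x||_1
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)
lam = 0.5

t = 1.0 / np.linalg.norm(A, 2) ** 2   # constant step 1/L
x = np.zeros(20)
for _ in range(500):
    x = soft_threshold(x - t * A.T @ (A @ x - b), lam * t)
```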
Definition

proximal mapping associated with closed convex h

\[
\operatorname{prox}_h(x) = \operatorname*{argmin}_u \Big( h(u) + \tfrac{1}{2}\|u - x\|_2^2 \Big)
\]

it can be shown that prox_h(x) exists and is unique for all x

subgradient characterization

from the optimality conditions of the minimization in the definition:

\[
u = \operatorname{prox}_h(x) \iff x - u \in \partial h(u)
\]
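As a quick sanity check of this characterization (an illustrative verification, not from the slides): for h(u) = t\|u\|_1 the subdifferential is ∂h(u)_i = {t · sign(u_i)} when u_i ≠ 0 and [−t, t] when u_i = 0, so the condition x − u ∈ ∂h(u) can be tested componentwise against the soft-thresholding output:

```python
import numpy as np

t = 0.3
x = np.array([1.0, -0.2, 0.25, -1.5])
u = np.sign(x) * np.maximum(np.abs(x) - t, 0.0)   # u = prox_h(x) for h = t*||.||_1
r = x - u                                          # candidate subgradient of h at u
ok = np.all(np.where(u != 0, np.isclose(r, t * np.sign(u)), np.abs(r) <= t))
print(ok)   # True: x - u lies in the subdifferential of t*||.||_1 at u
```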