Alternating Direction Method of Multipliers
Prof S. Boyd
EE364b, Stanford University
source:
Distributed Optimization and Statistical Learning via the Alternating
Direction Method of Multipliers (Boyd, Parikh, Chu, Peleato, Eckstein)
1
Goals
robust methods for
◮ arbitrary-scale optimization
– machine learning/statistics with huge data-sets
– dynamic optimization on large-scale network
◮ decentralized optimization
– devices/processors/agents coordinate to solve large problem, by passing
relatively small messages
2
Outline
Dual decomposition
Method of multipliers
Alternating direction method of multipliers
Common patterns
Examples
Consensus and exchange
Conclusions
Dual decomposition
3
Dual problem
◮ convex equality constrained optimization problem
minimize
subject to Ax = b
f (x)
◮ Lagrangian: L(x, y) = f (x) + yT (Ax − b)
◮ dual function: g(y) = inf x L(x, y)
◮ dual problem: maximize g(y)
◮ recover x⋆ = argminx L(x, y⋆)
Dual decomposition
4
Dual ascent
◮ gradient method for dual problem: yk+1 = yk + αk∇g(yk)
◮ ∇g(yk) = A˜x − b, where ˜x = argminx L(x, yk)
◮ dual ascent method is
xk+1
yk+1
:= argminx L(x, yk)
:= yk + αk(Axk+1 − b)
// x-minimization
// dual update
◮ works, with lots of strong assumptions
Dual decomposition
5
Dual decomposition
◮ suppose f is separable:
f (x) = f1(x1) + · · · + fN (xN ),
x = (x1, . . . , xN )
◮ then L is separable in x: L(x, y) = L1(x1, y) + · · · + LN (xN , y) − yT b,
Li(xi, y) = fi(xi) + yT Aixi
◮ x-minimization in dual ascent splits into N separate minimizations
xk+1
i
:= argmin
xi
Li(xi, yk)
which can be carried out in parallel
Dual decomposition
6
Dual decomposition
◮ dual decomposition (Everett, Dantzig, Wolfe, Benders 1960–65)
xk+1
i
yk+1
:= argminxi Li(xi, yk),
:= yk + αk(PN
i=1 Aixk+1
i = 1, . . . , N
i − b)
◮ scatter yk; update xi in parallel; gather Aixk+1
i
◮ solve a large problem
– by iteratively solving subproblems (in parallel)
– dual variable update provides coordination
◮ works, with lots of assumptions; often slow
Dual decomposition
7
Outline
Dual decomposition
Method of multipliers
Alternating direction method of multipliers
Common patterns
Examples
Consensus and exchange
Conclusions
Method of multipliers
8