We consider nonsmooth optimization problems under affine constraints, where the objective is the average of the component functions of a large number $n$ of agents, and we only assume access to the Fenchel conjugates of the component functions. The method of choice for solving such problems is the dual subgradient method, also known as dual decomposition, which converges in $O(\frac{1}{\epsilon^2})$ iterations in the convex case. However, each iteration requires computing the Fenchel conjugate of each of the $n$ agents, leading to a total complexity of $O(\frac{n}{\epsilon^2})$, which can be prohibitive in practical applications. To overcome this, we propose a two-stage algorithm that combines a stochastic subgradient method on the dual problem with a block-coordinate Frank-Wolfe algorithm to recover primal solutions. The resulting algorithm requires only $O(\frac{1}{\epsilon^2} + \frac{n}{\epsilon^{2/3}})$ calls to Fenchel conjugates to obtain an $\epsilon$-optimal primal solution in the convex case. We extend our results to nonconvex component functions and show that our method still applies with (almost) the same convergence rate, this time to an approximate primal solution, recovering the classical duality gap bounds usually obtained via the Shapley-Folkman theorem.
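For concreteness, one natural way to write such a problem and its dual is sketched below; the specific coupling constraint $\sum_i A_i x_i = b$ and the conjugate scaling are illustrative assumptions, not taken from the abstract.

```latex
% Primal: average of n component functions under an affine coupling constraint
\min_{x_1,\dots,x_n} \; \frac{1}{n}\sum_{i=1}^{n} f_i(x_i)
\quad \text{s.t.} \quad \sum_{i=1}^{n} A_i x_i = b.

% Lagrangian dual, expressed through the Fenchel conjugates f_i^*:
\max_{y} \; -\frac{1}{n}\sum_{i=1}^{n} f_i^*\!\bigl(-n A_i^\top y\bigr) \;-\; y^\top b.
```

Each dual (sub)gradient evaluation decomposes across the $n$ conjugates $f_i^*$, which is why a full dual subgradient step costs $n$ conjugate calls, whereas a stochastic step samples only a few of them.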

