Title: Stochastic Conditional Gradient++

Authors: H. Hassani, A. Karbasi, A. Mokhtari, and Z. Shen

Abstract

In this paper, we consider general non-oblivious stochastic optimization, where the underlying stochasticity may change during the optimization procedure and depends on the point at which the function is evaluated. We develop Stochastic Frank-Wolfe++ (SFW++), an efficient variant of the conditional gradient method for minimizing a smooth non-convex function subject to a convex body constraint. We show that SFW++ converges to an $\epsilon$-first order stationary point by using $O(1/\epsilon^3)$ stochastic gradients. When further structure is present, SFW++'s theoretical guarantees, in terms of both the convergence rate and the quality of its solution, improve. In particular, for minimizing a convex function, SFW++ achieves an $\epsilon$-approximate optimum while using $O(1/\epsilon^2)$ stochastic gradients; this rate is known to be optimal in terms of stochastic gradient evaluations. Similarly, for maximizing a monotone continuous DR-submodular function, a slightly different form of SFW++, called Stochastic Continuous Greedy++ (SCG++), achieves a tight $[(1-1/e)\mathrm{OPT} - \epsilon]$ solution while using $O(1/\epsilon^2)$ stochastic gradients. Through an information-theoretic argument, we also prove that SCG++'s convergence rate is optimal. Finally, for maximizing a non-monotone continuous DR-submodular function, we can achieve a $[(1/e)\mathrm{OPT} - \epsilon]$ solution by using $O(1/\epsilon^2)$ stochastic gradients. We highlight that our results and our novel variance reduction technique trivially extend to the standard and easier oblivious stochastic optimization setting, for both (non-)convex and continuous submodular objectives.

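For orientation, the sketch below shows a plain stochastic Frank-Wolfe (conditional gradient) loop, the baseline that SFW++ refines; it is not the paper's variance-reduced SFW++. The least-squares objective, simplex constraint, batch size, and step-size schedule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Minimal sketch of a vanilla stochastic Frank-Wolfe loop (baseline, not SFW++):
# minimize f(x) = ||Ax - b||^2 / (2n) over the probability simplex.
# Objective, data, batch size, and step sizes are illustrative assumptions.

rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def stochastic_grad(x, batch_size=32):
    """Unbiased minibatch estimate of the gradient of the least-squares loss."""
    idx = rng.integers(0, n, size=batch_size)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / batch_size

def lmo_simplex(g):
    """Linear minimization oracle over the simplex: the minimizer of <g, v> is a vertex."""
    v = np.zeros_like(g)
    v[np.argmin(g)] = 1.0
    return v

x = np.full(d, 1.0 / d)                # start at the simplex barycenter (feasible)
for t in range(1, 501):
    g = stochastic_grad(x)             # stochastic gradient estimate
    v = lmo_simplex(g)                 # Frank-Wolfe direction from the linear oracle
    gamma = 2.0 / (t + 2)              # standard diminishing step size
    x = (1 - gamma) * x + gamma * v    # convex combination keeps the iterate feasible
```

SFW++ departs from this baseline by replacing the plain minibatch gradient estimate with the paper's variance reduction technique, which drives the improved stochastic gradient complexities stated in the abstract.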
