Title: Fast greedy algorithms in mapreduce and streaming

Authors: R. Kumar, B. Moseley, S. Vassilvitskii, and A. Vattani


Greedy algorithms are practitioners’ best friends—they are intu- itive, simple to implement, and often lead to very good solutions. However, implementing greedy algorithms in a distributed setting is challenging since the greedy choice is inherently sequential, and it is not clear how to take advantage of the extra processing power. Our main result is a powerful sampling technique that aids in parallelization of sequential algorithms. We then show how to use this primitive to adapt a broad class of greedy algorithms to the MapReduce paradigm; this class includes maximum cover and submodular maximization subject to p-system constraints. Our method yields efficient algorithms that run in a logarithmic num- ber of rounds, while obtaining solutions that are arbitrarily close to those produced by the standard sequential greedy algorithm. We begin with algorithms for modular maximization subject to a ma- troid constraint, and then extend this approach to obtain approxima- tion algorithms for submodular maximization subject to knapsack or p-system constraints. Finally, we empirically validate our algo- rithms, and show that they achieve the same quality of the solution as standard greedy algorithms but run in a substantially fewer num- ber of rounds.

Full Text: [PDF]

Accessibility at Yale   Inference, Information, and Decision Group at Yale