Title: Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

Authors: Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi , Amin Karbasi

Abstract

Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a mono- tone submodular function in the streaming set- ting with a cardinality constraint k. We first propose SIEVE-STREAMING++, which requires just one pass over the data, keeps only O(k) el- ements and achieves the tight 1/2-approximation guarantee. The best previously known stream- ing algorithms either achieve a suboptimal 1/4- approximation with ⇥(k) memory or the opti- mal 1/2-approximation with O(k log k) memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering pro- cedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of SIEVE- STREAMING++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight 1/2-approximation guarantee with O(k) shared memory while minimizing not only the required rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.

Full Text: [PDF]

Accessibility at Yale   Inference, Information, and Decision Group at Yale