We consider the problem of learning sparse rep- resentations of data sets, where the goal is to re- duce a data set in manner that optimizes mul- tiple objectives. Motivated by applications of data summarization, we develop a new model which we refer to as the two-stage submodu- lar maximization problem. This task can be viewed as a combinatorial analogue of repre- sentation learning problems such as dictionary learning and sparse regression. The two-stage problem strictly generalizes the problem of car- dinality constrained submodular maximization, though the objective function is not submodu- lar and the techniques for submodular maximiza- tion cannot be applied. We describe a continuous optimization method which achieves an approx- imation ratio which asymptotically approaches 1 1/e. For instances where the asymptotics do not kick in, we design a local-search algorithm whose approximation ratio is arbitrarily close to 1/2. We empirically demonstrate the effective- ness of our methods on two multi-objective data summarization tasks, where the goal is to con- struct summaries via sparse representative sub- sets w.r.t. to predefined objectives.