In this paper we study the adaptive complexity of submodular optimization. Informally, the adaptive complexity of a problem is the minimal number of sequential rounds required to achieve a constant factor approximation when polynomially-many queries can be executed in parallel at each round. Adaptivity is a fundamental concept that is heavily studied in computer science, largely due to the need for parallelizing computation. Somewhat surprisingly, very little is known about adaptivity in submodular optimization. For the canonical problem of maximizing a monotone submodular function under a cardinality constraint, to the best of our knowledge, all that is known to date is that the adaptive complexity is between 1 and Ω(n). Our main result in this paper is a tight characterization showing that the adaptive complexity ofmaximizingamonotonesubmodularfunctionunderacardinalityconstraintisΘ ̃(logn): • We describe an algorithm which requires O(log n) sequential rounds and achieves an ap- proximation that is arbitrarily close to 1/3; • We show that no algorithm can achieve an approximation better than O( 1 ) with fewer log n than O( log n ) rounds. log log n Thus, when allowing for parallelization, our algorithm achieves a constant factor approximation exponentially faster than any known existing algorithm for submodular maximization. Importantly, the approximation algorithm is achieved via adaptive sampling and comple- ments a recent line of work on optimization of functions learned from data. In many cases we do not know the functions we optimize and learn them from labeled samples. Recent results show that no algorithm can obtain a constant factor approximation guarantee using polynomially-many labeled samples as in the PAC and PMAC models, drawn from any distribu- tion [BRS17, BS17a]. Since learning with non-adaptive samples over any distribution results in a sharp impossibility, we consider learning with adaptive samples where the learner obtains poly(n) samples drawn from a distribution of her choice in every round. Our result implies that in the realizable case, where there is a true underlying function generating the data, Θ ̃(logn) batches of adaptive samples are necessary and sufficient to approximately “learn to optimize” a monotone submodular function under a cardinality constraint.