In this paper, we consider optimizing submodular functions that are drawn from some unknown distribution. This setting arises, e.g., in recommender systems, where the utility of a subset of items may depend on a user-specific submodular utility function. In modern applications, the ground set of items is often so large that even the widely used (lazy) greedy algorithm is not efficient enough. As a remedy, we introduce the problem of sublinear time probabilistic submodular maximization: Given training examples of functions (e.g., via user feature vectors), we seek to reduce the ground set so that optimizing new functions drawn from the same distribution will provide almost as much value when restricted to the reduced ground set as when using the full set. We cast this problem as a two-stage submodular maximization and develop a novel efficient algorithm for this problem which offers 1/2(1−1/e^2) approximation ratio for general monotone submodular functions and general matroid constraints. We demonstrate the effectiveness of our approach on several real-world applications where running the maximization problem on the reduced ground set leads to two orders of magnitude speed-up while incurring almost no loss.