Mizan: Optimizing Graph Mining in Large Parallel Systems Panos Kalnis, KAUST Extracting information from graphs, from finding shortest paths to complex graph mining, is essential for many applications. Due to the shear size of modern graphs (e.g., social networks), processing must be done on large parallel computing infrastructures (e.g., the cloud). Earlier approaches relied on the MapReduce framework, which proved inadequate for graph algorithms. Recently, the message-passing model (e.g., Pregel) has emerged. Although the Pregel model has many advantages, it is agnostic to the graph properties and the architecture of the underlying computing infrastructure, leading to suboptimal performance. In this talk, I will present Mizan, a layer between the users' code and the computing infrastructure. Mizan considers the structure of the input graph and the architecture of the infrastructure in order to: (i) decide whether it is beneficial to generate a near-optimal partitioning of the graph in a pre-processing step, and (ii) choose between typical point-to-point message passing and a novel approach that puts computing nodes in a virtual overlay ring. We deployed Mizan on a small local Linux cluster, on the cloud (256 virtual machines in Amazon EC2), and on an IBM BlueGene/P supercomputer (1024 CPUs). Mizan executes common algorithms on very large graphs up to one order of magnitude faster than MapReduce-based implementations and up to 4 times faster than implementations relying on Pregel-like hash-based graph partitioning.