TITLE: Scalability, Data Distribution and Usability of Machine Learning with GraphLab

ABSTRACT: Today, machine learning (ML) methods play a central role in industry and science. The growth of the Web and improvements in sensor data collection technology have been rapidly increasing the magnitude and complexity of the ML tasks we must solve. This growth is driving the need for scalable, parallel ML algorithms that can handle Big Data. In this talk, we will focus on:

1. Examining common algorithmic patterns in distributed ML methods.
2. Developing data partitioning schemes, for both graph and tabular data, that enable distributed ML while supporting these patterns.
3. Describing a computational framework for implementing these algorithms at scale, for both graph and tabular data.
4. Addressing a core challenge of large-scale ML: enabling the widespread adoption of machine learning beyond experts.

Our computational framework, GraphLab, has seen wide adoption in industry and provides orders-of-magnitude performance improvements over existing frameworks. This talk represents joint work with Yucheng Low, Joey Gonzalez, Aapo Kyrola, Jay Gu, Danny Bickson, Joseph Bradley and Tyler Johnson.
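
The algorithmic pattern underlying GraphLab's graph computation is commonly described as a Gather-Apply-Scatter (GAS) vertex program. The sketch below illustrates that pattern with a toy synchronous PageRank in plain Python; the function name, the toy graph, and the synchronous loop are our own illustrative assumptions, not GraphLab's actual API, which distributes vertices across machines and supports asynchronous execution.

```python
# A minimal sketch of the Gather-Apply-Scatter (GAS) vertex-program
# pattern, illustrated with PageRank on a tiny in-memory graph.
# NOTE: function name and graph are illustrative only, not GraphLab's API.

def pagerank_gas(edges, num_vertices, damping=0.85, iters=20):
    """Run synchronous GAS iterations of PageRank.

    edges: list of (src, dst) directed edges.
    Returns a list of PageRank values, one per vertex.
    """
    out_deg = [0] * num_vertices
    in_nbrs = [[] for _ in range(num_vertices)]
    for s, d in edges:
        out_deg[s] += 1
        in_nbrs[d].append(s)

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iters):
        new_rank = []
        for v in range(num_vertices):
            # Gather: sum contributions from in-neighbors.
            acc = sum(rank[u] / out_deg[u] for u in in_nbrs[v])
            # Apply: update this vertex's value.
            new_rank.append((1 - damping) / num_vertices + damping * acc)
        # Scatter: in real GraphLab, a vertex signals neighbors whose
        # values must be recomputed; this synchronous sketch simply
        # re-runs every vertex each iteration.
        rank = new_rank
    return rank

# Usage on a 3-vertex graph with edges 0->1, 1->2, 2->0, 0->2:
ranks = pagerank_gas([(0, 1), (1, 2), (2, 0), (0, 2)], 3)
```

The GAS decomposition matters for data distribution because the gather and scatter phases touch only a vertex's edges, so partitioning the edge set across machines (rather than the vertex set) bounds communication per vertex program.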