Friday, November 17 • 9:00am - 9:40am
Composable Parallel Processing in Apache Spark and Weld

The main reason people are productive writing software is composability: engineers can take libraries and functions written by other developers and easily combine them into a program. However, composability took a back seat in early parallel processing APIs. For example, composing MapReduce jobs required writing the output of every job to a file, which was both error-prone and slow. Apache Spark helped simplify cluster programming largely because it enabled efficient composition of parallel functions, leading to a large standard library and high-level APIs in various languages. In this talk, I'll explain how composability has evolved in Spark's newer APIs, and I'll present Weld, a new research project I'm leading at Stanford to enable much richer composition of software on emerging parallel hardware (multicores, GPUs, etc.). Systems like Weld and Spark will allow engineers to focus on building their applications rather than on the intricacies of parallel hardware, and might represent one of the best ways we have to tame the ever-diversifying hardware landscape.

Matei Zaharia

Chief Technologist, Databricks
Matei Zaharia is an Assistant Professor of Computer Science at Stanford and Co-founder and Chief Technologist at Databricks. He started the Apache Spark project during his PhD at UC Berkeley, and has worked on other widely used open source data analytics and AI software including...

Friday November 17, 2017 9:00am - 9:40am PST