The Salesforce Einstein platform is used by internal developers to create predictive applications for Salesforce customers. The platform uses Apache Spark as its data processing engine and runs a very large number of data flows that vary widely in size and complexity. It must support both large, complex flows, such as running modeling for a customer who needs tens of millions of entities scored, and small, time-sensitive flows, such as incrementally processing and scoring object changes for a customer with only thousands of entities in total. These diverse flows arise for every application added to Salesforce, and the platform handles many applications, which magnifies the importance of appropriate scaling and time sensitivity. In this talk, I'll present how we handle that diversity in data flows while keeping cost to serve to a minimum. I will detail where and how we chose to leverage open source and where we decided it was important to implement our own solutions.