Lyft drives ML platform Flyte into the open

Rideshare company Lyft has open sourced its orchestration engine Flyte, the secret sauce of its machine learning pipeline management.

According to the project’s website, Flyte is meant to make it easy to “create concurrent, scalable, and maintainable workflows for machine learning and data processing”. It also appears to be production-tested: the company says that teams ranging from Estimated Time of Arrival to Self-Driving and Pricing have used the platform for more than three years.

On closer inspection, Flyte, like so many other platforms, aims to provide abstraction, so that devs can focus on business logic and don’t have to take care of the underlying infrastructure. It is designed as a multi-tenant system that supports both isolated repositories and workflows shared across tenants.

Workflows are composed as graphs that have to conform to Lyft’s specifications, so that every task receives its data in a usable format. This is essential given that Flyte is used for data-heavy operations that can be quite compartmentalised. Since such operations also tend to drain resources, the project provisions resources dynamically for each execution and frees them up as tasks are completed to keep costs low. A web interface is also available to let users track a workflow’s status.
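To illustrate the idea of a workflow as a graph of tasks, here is a minimal sketch in plain Python. It does not use Flyte’s own SDK or specification; the `run_workflow` helper, the task names, and the dictionary-based graph format are all assumptions made for this example. It only shows how declaring dependencies lets a runner hand each task the outputs of the tasks it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, deps, inputs):
    """Run tasks in dependency order (hypothetical helper, not Flyte's API).

    tasks:  task name -> callable taking a dict of upstream results
    deps:   task name -> set of names it depends on
    inputs: values supplied from outside the workflow
    """
    results = dict(inputs)
    for name in TopologicalSorter(deps).static_order():
        if name in results:
            # Workflow inputs are already available, nothing to run.
            continue
        upstream = {dep: results[dep] for dep in deps.get(name, ())}
        results[name] = tasks[name](upstream)
    return results


# A small illustrative graph: clean the data, then compute a mean.
tasks = {
    "clean": lambda up: [x for x in up["raw"] if x is not None],
    "total": lambda up: sum(up["clean"]),
    "count": lambda up: len(up["clean"]),
    "mean": lambda up: up["total"] / up["count"],
}
deps = {
    "clean": {"raw"},
    "total": {"clean"},
    "count": {"clean"},
    "mean": {"total", "count"},
}

results = run_workflow(tasks, deps, {"raw": [1, None, 2, 3]})
print(results["mean"])
```

In this sketch, `total` and `count` do not depend on each other, which is the kind of structure a real orchestrator could exploit to run tasks concurrently.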

To make sure any result is reproducible, Flyte versions tasks and containerises them together with their dependencies, so executions can be reconstructed later and are consistent across environments. This also hints at Flyte’s modular approach, since container images bound to specific tasks let users combine very different steps into complex workflows.

Strong typing in a task’s or workflow’s inputs and outputs means parameterisation is an option, while there’s also a way of marking tasks as cacheable, so the results can be used at another point without the need to compute them again. All in the name of efficiency.
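The combination of typed inputs, task versions, and cacheable results can be sketched as follows. This is not Flyte’s API: the `task` decorator, its `version` and `cacheable` parameters, and the in-memory `_cache` are hypothetical names chosen for illustration, and the type check only handles plain classes such as `int` or `float`.

```python
import hashlib
import json
from typing import get_type_hints

_cache = {}  # in-memory stand-in for a real result store (assumption)


def task(version, cacheable=False):
    """Hypothetical decorator: checks input types and caches results."""
    def decorator(fn):
        hints = get_type_hints(fn)

        def wrapper(**kwargs):
            # Strong typing: reject inputs that do not match the annotations.
            for name, value in kwargs.items():
                expected = hints.get(name)
                if expected is not None and not isinstance(value, expected):
                    raise TypeError(f"{name} must be {expected.__name__}")
            # The cache key covers task name, version, and inputs, so a new
            # version or different inputs never reuse an old result.
            payload = json.dumps([fn.__name__, version, kwargs], sort_keys=True)
            key = hashlib.sha256(payload.encode()).hexdigest()
            if cacheable and key in _cache:
                return _cache[key]
            result = fn(**kwargs)
            if cacheable:
                _cache[key] = result
            return result

        return wrapper
    return decorator


CALLS = []  # records how often the task body actually runs


@task(version="v1", cacheable=True)
def fare(distance_km: float, rate: float) -> float:
    CALLS.append(1)
    return distance_km * rate


first = fare(distance_km=10.0, rate=1.5)
second = fare(distance_km=10.0, rate=1.5)  # served from the cache
print(first, second, len(CALLS))
```

Including the version in the cache key is one plausible way to keep cached results tied to the exact task definition that produced them; how Flyte itself derives its cache keys is not described in the announcement.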

Lyft is best known in the open source community for releasing the Envoy proxy, which has since found adopters in companies like Google, AWS, and Netflix, and even made it through the CNCF graduation process. It remains to be seen whether Flyte can match that success.