MLflow to extend Kubernetes support in next release

MLflow should deliver extended Kubernetes support in its next release, after its 1.0 release boosted Docker support last week.

Last year, Databricks cofounder and chief technologist Matei Zaharia told Devclass that Kubernetes and Windows support were key targets for the 1.0 release.

Windows support duly came in last week’s 1.0 release – just squeaking in under the firm’s self-imposed deadline – along with an experimental CLI command that enables users to build a Docker image capable of serving an MLflow model. However, there was no explicit mention of Kubernetes.

Speaking to Devclass this week, Databricks director of product management Clemens Mewald said, “Kubernetes support is a top item for the community.”

He said that when it came to deploying models on Kubernetes clusters, “The most common way of doing this is to build a Docker container and deploy that on Kubernetes. The feature we added to easily build a Docker container from an MLflow model addresses that request.”

The second dimension to Kubernetes support, he continued, “is being able to deploy the MLflow server and the UI and all of its components onto Kubernetes and host it.”

“We’re actually working with community contributors on this right now,” he said. “There’s a big pull request that we’re reviewing to enable this functionality … We’re basically reviewing and deciding that the design decisions are sound.”

This should be done within the next quarter, he said. “Once it’s submitted [to GitHub] it’s available, but it just becomes a release when we start packing it up as a release.”

Mewald also fleshed out the company’s plans for a model registry, which he described as “the most important and biggest piece we’re prioritising next quarter.”

He said some customers had hundreds, even thousands, of models. “You really need a good and principled way to manage that number of models … to be able to register your models, give them a name, and then your models are being versioned and then you can manage the lifecycle of the model.”

“The model registry allows our users to manage the lifecycle and in some cases we’ve seen companies apply as much rigour to this process,” he continued, “So integrating this process with CI/CD systems, and have approval gates in there and so on.”
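To make the idea concrete, here is a minimal sketch of the pattern Mewald describes: models are registered under a name, versioned automatically, and moved through lifecycle stages behind an approval gate. It is an in-memory illustration only and not MLflow's API, which had not been published at the time of writing. The class names, stage names, and artifact paths are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical lifecycle stages; the article does not name any.
STAGES = ("registered", "approved", "retired")


@dataclass
class ModelVersion:
    name: str
    version: int
    artifact_path: str
    stage: str = "registered"


class ToyRegistry:
    """In-memory illustration of a model registry; not MLflow's API."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ModelVersion]] = {}

    def register(self, name: str, artifact_path: str) -> ModelVersion:
        # Each registration under the same name gets the next version number.
        versions = self._versions.setdefault(name, [])
        model_version = ModelVersion(name, len(versions) + 1, artifact_path)
        versions.append(model_version)
        return model_version

    def transition(self, name: str, version: int, stage: str,
                   approved_by: str = "") -> ModelVersion:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        # Approval gate: promotion requires a named approver,
        # the kind of check a CI/CD system might enforce.
        if stage == "approved" and not approved_by:
            raise PermissionError("approval gate: an approver is required")
        model_version = self._versions[name][version - 1]
        model_version.stage = stage
        return model_version


registry = ToyRegistry()
first = registry.register("churn", "models/churn/run-a")   # hypothetical path
second = registry.register("churn", "models/churn/run-b")  # hypothetical path
registry.transition("churn", second.version, "approved", approved_by="reviewer")
```

In a real deployment the gate would more likely be wired into a CI/CD pipeline than passed as a string argument; the sketch only shows where such a check would sit.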

Hard on the heels of that, he said, would be the capability to express multi-step workflows, which the company recently demoed. “Often the code our users run to train models is not just one step, it’s multiple steps – basically transforming data, training your model, running evaluations and so on, and a lot of our users want to express those steps in MLflow – that’s something that’s farther out and probably going to happen later this year.”
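As a rough illustration of what "multiple steps" means here, the sketch below chains the three stages Mewald mentions (transforming data, training, evaluating) as plain Python functions run in order. It is not the MLflow feature itself, whose design had not been described. The function names, the context dictionary, and the toy "model" (a simple mean) are assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Step = Tuple[str, Callable[[Context], Context]]


def transform(ctx: Context) -> Context:
    # Scale the raw values; stands in for real data preparation.
    return {**ctx, "features": [x / 10.0 for x in ctx["raw"]]}


def train(ctx: Context) -> Context:
    # Toy "model": the mean of the features.
    features = ctx["features"]
    return {**ctx, "model": sum(features) / len(features)}


def evaluate(ctx: Context) -> Context:
    # Mean squared error of the toy model against the same features.
    features, model = ctx["features"], ctx["model"]
    mse = sum((x - model) ** 2 for x in features) / len(features)
    return {**ctx, "mse": mse}


def run_workflow(steps: List[Step], ctx: Context) -> Context:
    """Run each step in order, passing the context along and recording names."""
    completed: List[str] = []
    for name, step in steps:
        ctx = step(ctx)
        completed.append(name)
    return {**ctx, "completed": completed}


PIPELINE: List[Step] = [
    ("transform", transform),
    ("train", train),
    ("evaluate", evaluate),
]

result = run_workflow(PIPELINE, {"raw": [10, 20, 30]})
```

Each step receives the output of the one before it, so a tracking tool could in principle log parameters and metrics per step rather than per script.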