MLflow has hit v1.0, a year after the first launch of the machine learning management project and just inside its self-imposed deadline.
The MLflow project was launched by Databricks last June as a way of managing the machine learning lifecycle and helping data scientists track and share their experiments.
The project had been gunning for a full 1.0 designation in the first half of 2019, and yesterday announced general availability of just that. In a blog post, director of product management Clemens Mewald and Databricks co-founder Matei Zaharia pointed out a raft of new features.
Top of the list is support for “recording, querying, and visualizing metrics along a new ‘step’ axis (x coordinate), providing increased flexibility for examining model performance relative to training progress.”
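To make the step axis concrete, here is a minimal sketch of logging a training curve with an explicit step per point. The helper `log_curve` and the stand-in `fake_log_metric` are hypothetical names introduced for illustration; the stand-in only mimics the `key, value, step` shape of a metric-logging call so the snippet runs without a tracking server.

```python
from typing import Callable, Iterable, List, Tuple


def log_curve(
    key: str,
    values: Iterable[float],
    log_metric: Callable[..., None],
) -> int:
    """Log each value under `key`, using its position as the step.

    Returns the number of points logged.
    """
    count = 0
    for step, value in enumerate(values):
        # step is the x coordinate, value is the y coordinate.
        log_metric(key, value, step=step)
        count += 1
    return count


# Hypothetical stand-in logger (an assumption, not part of MLflow):
# it simply records what it is given.
recorded: List[Tuple[str, float, int]] = []


def fake_log_metric(key: str, value: float, step: int = 0) -> None:
    recorded.append((key, value, step))


logged = log_curve("loss", [0.9, 0.5, 0.3], fake_log_metric)
print(logged, recorded)
```

With MLflow 1.0 installed, passing `mlflow.log_metric` in place of the stand-in, inside a `with mlflow.start_run():` block, should record the same points against the step axis, since that function accepts a `step` keyword argument.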
Search functionality has been improved: the search filter API now supports a simplified version of the SQL WHERE clause, and runs can be searched by attributes and tags in addition to metrics and parameters.
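As a rough illustration of what such a WHERE-style filter might look like, the sketch below assembles one from individual comparisons. The helper `build_filter` is hypothetical, and the `metrics.`, `params.` and `tags.` prefixes, the single-quoted string values and the `and` joiner reflect our reading of the MLflow search syntax; check them against the documentation for the version in use.

```python
from typing import List, Tuple, Union

# (field, operator, value), e.g. ("metrics.rmse", "<", 0.8)
Clause = Tuple[str, str, Union[str, int, float]]


def build_filter(clauses: List[Clause]) -> str:
    """Join comparisons into a single filter string using 'and'."""
    parts = []
    for field, op, value in clauses:
        if isinstance(value, str):
            if "'" in value:
                # Quote escaping is out of scope for this sketch.
                raise ValueError("single quotes in values are not handled")
            rendered = "'" + value + "'"
        else:
            rendered = str(value)
        parts.append(f"{field} {op} {rendered}")
    return " and ".join(parts)


filter_string = build_filter(
    [
        ("metrics.rmse", "<", 0.8),
        ("params.model", "=", "tree"),
        ("tags.team", "=", "demo"),  # hypothetical tag name
    ]
)
print(filter_string)
```

A string of this form could then be supplied as the `filter_string` argument to `MlflowClient.search_runs`, along with a list of experiment IDs.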
A runs/log-batch REST API endpoint, which logs multiple metrics, parameters, and tags with a single API request, should make batch logging easier.
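The shape of a batch request can be sketched as follows. This only builds the JSON body and does not contact a server. The helper `build_log_batch_body` and the run ID are hypothetical, and the field names (`run_id`, `metrics`, `params`, `tags`, with `key`, `value`, `timestamp` and `step` entries) follow our understanding of the MLflow REST API, so treat them as assumptions to verify.

```python
import json
import time
from typing import Any, Dict, Optional


def build_log_batch_body(
    run_id: str,
    metrics: Dict[str, float],
    params: Dict[str, Any],
    tags: Dict[str, Any],
    step: int = 0,
    timestamp_ms: Optional[int] = None,
) -> Dict[str, Any]:
    """Bundle metrics, params and tags into one request body."""
    # Metric timestamps are expressed in milliseconds since the epoch.
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    return {
        "run_id": run_id,
        "metrics": [
            {"key": k, "value": v, "timestamp": ts, "step": step}
            for k, v in metrics.items()
        ],
        # Params and tags are sent as string values.
        "params": [{"key": k, "value": str(v)} for k, v in params.items()],
        "tags": [{"key": k, "value": str(v)} for k, v in tags.items()],
    }


body = build_log_batch_body(
    run_id="hypothetical-run-id",
    metrics={"rmse": 0.72, "mae": 0.51},
    params={"alpha": 0.5},
    tags={"team": "demo"},
    step=3,
    timestamp_ms=1559779200000,
)
print(json.dumps(body, indent=2))
```

The resulting body would be sent in a single POST to the tracking server's runs/log-batch endpoint, rather than as one request per metric, parameter, or tag.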
Hadoop Distributed File System (HDFS) has been added to the supported storage backends, alongside Amazon S3, Azure Blob Storage, Google Cloud Storage, SFTP, and NFS.
Windows support has been extended as promised back in October, and Microsoft users can now track experiments with the MLflow 1.0 Windows client.
More seamless Kubernetes support had also been mentioned as an aim by Zaharia back in October. However, there was no mention of Kubernetes in the 1.0 release, though particularly keen users still have the option of working around it.
Meanwhile, a new (experimental) CLI command enables users to build a Docker image capable of serving an MLflow model.
Looking ahead, Mewald and Zaharia said, “We are also investing in new components to cover more of the ML lifecycle. The next major addition to MLflow will be a Model Registry that allows users to manage their ML model’s lifecycle from experimentation to deployment and monitoring.”
There are a large number of major breaking changes, as well as other fixes and features, listed in the project's release notes.
Earlier this week, mega consultancy McKinsey’s QuantumBlack arm launched its Kedro machine learning framework. Speaking to Devclass, product manager Yetunde Dada described MLflow as “the application of one software engineering principle…this whole ability to do versioning.”
Kedro, she said, “is essentially the application of modularity, which is being able to split your code base into small chunks so it’s easy to test.”