MLflow smooths out natural language processing and Microsoft support in 1.8 release

Databricks’ machine learning platform MLflow has learned to play nice with AzureML and spaCy models in version 1.8, which is now available for download.

MLflow is an open source project, licensed under the Apache License 2.0, that can be used with a variety of machine learning libraries. It comprises a tracking API to log and compare experiment results, a code packaging format for reproducible runs, a model packaging format and tools for deploying models, as well as a centralised registry meant to help teams manage a model’s lifecycle.

For the latest release, committers have done quite a bit to make the platform work better with Microsoft’s stable of products. Version 1.8, for example, provides an API to deploy MLflow models to Azure Machine Learning and finally lets users on Windows machines deploy models to AWS SageMaker via the platform’s CLI, which did not work before.

Those who don’t care too much for connections to Redmond might be more interested to learn that MLflow now comes with a module to save and load models using the popular natural language processing library spaCy, adding a bit more flexibility to the platform.
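For readers who want to see what saving and loading a spaCy model through MLflow can look like, here is a minimal sketch. It assumes both the mlflow and spacy packages are installed; the helper name spacy_round_trip and the temporary directory layout are made up for illustration, and a blank English pipeline stands in for a trained model.

```python
import tempfile
from pathlib import Path


def spacy_round_trip(text):
    """Save a blank spaCy pipeline with MLflow, load it back, and tokenise text.

    Hypothetical helper for illustration; not part of MLflow itself.
    """
    # Imported lazily so that merely defining the function needs neither package.
    import spacy
    import mlflow.spacy

    nlp = spacy.blank("en")  # stand-in for a trained pipeline

    with tempfile.TemporaryDirectory() as tmp:
        # Target directory for the saved model; it does not exist yet.
        model_dir = Path(tmp) / "spacy_model"
        mlflow.spacy.save_model(spacy_model=nlp, path=str(model_dir))

        # A plain local path is used as the model URI here.
        reloaded = mlflow.spacy.load_model(model_uri=str(model_dir))
        return [token.text for token in reloaded(text)]
```

Within an active MLflow run, mlflow.spacy.log_model would be the counterpart for recording the model as a run artifact instead of writing it to a local path.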

Using MLflow with Docker should also be a bit easier in this version, since it’s now possible to pass arguments to docker run when running Docker-based projects. The platform’s SearchRuns API and its UI have also learned to recognise case-sensitive LIKE and case-insensitive ILIKE queries when running against a SQL backend, which can be used for pattern matching.
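The difference between the two operators can be illustrated without a database. The following sketch mimics the usual SQL semantics, where % matches any run of characters and _ matches exactly one, by translating a pattern into a regular expression. It is not MLflow's implementation, and the function names as well as the sample values are invented for the example.

```python
import re


def _like_regex(pattern, ignore_case):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any sequence of characters, including none
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.compile("".join(parts), flags)


def like(value, pattern):
    """Case-sensitive match, as with LIKE."""
    return _like_regex(pattern, ignore_case=False).fullmatch(value) is not None


def ilike(value, pattern):
    """Case-insensitive match, as with ILIKE."""
    return _like_regex(pattern, ignore_case=True).fullmatch(value) is not None


if __name__ == "__main__":
    # Hypothetical parameter values a search filter might be matched against.
    for value in ["spaCy-ner", "spacy-ner", "sklearn-rf"]:
        print(value, like(value, "spaCy%"), ilike(value, "spaCy%"))
```

In a search filter, the same idea would be expressed as something along the lines of params.model LIKE 'spaCy%', with the parameter name model again being only an example.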

To give a better idea of the state an application is in, Databricks fitted the REST API server with a health check endpoint, which returns a 200 status code as long as the app in question is live. Comparing runs should meanwhile become easier thanks to newly added change highlighting, which makes differing parameter values in the CompareRun view of the platform more visible.
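A monitoring script could poll such an endpoint with nothing but the standard library, roughly as sketched below. The /health path and the localhost:5000 address are assumptions for the example and should be checked against the actual deployment; the helper names are invented as well.

```python
import urllib.error
import urllib.request


def health_url(base_url, path="/health"):
    """Build the health check URL; the default path is an assumption."""
    return base_url.rstrip("/") + path


def is_healthy(base_url, path="/health", timeout=2.0):
    """Return True if the server answers the health check with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url, path), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    # Assumed address of a locally running tracking server.
    print(is_healthy("http://localhost:5000"))
```

Treating every connection error as "not healthy" keeps the check simple, which is usually what a liveness probe needs.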

Apart from that, metrics UI plots can now handle more input points, since the team switched from scatter to scattergl, and line smoothing has been improved. 

A complete list of features and bug fixes is available in the project’s GitHub repository.