Kubeflow 1.4 cuts down on duplication, automates build processes

The planned release date of September 27th came and went, but after some last-minute adjustments, Kubeflow, the project for running machine learning on Kubernetes, has finally landed in version 1.4. For the second major release of the year, the Kubeflow team focused on simplifying operations and streamlining machine learning workflows in order to keep the project maintainable and get more people interested.

Under the hood, Kubeflow learned to use additional metadata in its pipeline orchestration and model monitoring efforts, and now includes stabilised v2 protocols in the KFServing and KFPipelines components. The developers also worked to reduce the amount of redundant code in the project by replacing the separate training operators for TensorFlow, PyTorch, XGBoost, MXNet, BytePS, and LightGBM with a single universal operator and by developing common code shared across the available web apps.

Since the web apps still used the no-longer-supported version 8 of the JavaScript framework Angular, they have been updated to version 12 and now come fitted with Angular’s internationalisation implementation. The build processes have been reviewed as well and now use a higher degree of automation to make them both faster and less error-prone.

The team had hoped to deliver similar improvements by refactoring Kubeflow’s manifest files. However, some of the PRs needed to see this through were still under review when the release was cut, so it might take a little longer until admins get to enjoy a simplified installation process.

Long-time users of Kubeflow will be quick to notice that the project’s central dashboard now comes with a menu item for the Models web app. It also allows new menu items to be added, should a cluster admin want to integrate third-party applications into the navigation sidebar.

Other than that, the Kubeflow team fixed issues with autoscaling GPU nodegroups, limit calculations, and MountPath parsing in the Jupyter web app, and corrected problems with controller watches and websocket handling in notebooks, all of which should help smooth out the user experience. Additional details can be found in the Kubeflow release notes.

Kubeflow is a Google-initiated open-source project that is meant to help users deploy their machine learning workflows on Kubernetes. It is based on the company’s own method for deploying models built with TensorFlow, another Google project.