Google Cloud’s AI Platform has opened the doors to a new Prediction product. AI Platform Prediction is mainly aimed at enterprises looking for a fully managed service that hosts their machine learning (ML) models and handles prediction requests on their behalf.
To keep models available to clients, the service is also said to provide automatic scaling and load balancing, while access and request/response logs can help figure out what happened should something, God forbid, go awry. Additionally, Prediction surfaces resource metrics such as GPU and network utilisation to help optimise an ML model further.
Developers at Google have been quick to highlight the new service’s architecture, which is based upon a Kubernetes Engine backend. This is expected to make the system reliable, reduce overhead latency, and allow for a wide variety of hardware options.
Initially, the product seems to have allowed only TensorFlow models, which isn’t massively surprising given it’s a Google project. As the managed service is mainly geared towards prediction scenarios, however, the deep-learning-focused framework can be overkill, which is why Prediction now also supports scikit-learn and XGBoost.
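For the scikit-learn route, deployment essentially comes down to exporting a trained model as an artifact the service can load. The sketch below is illustrative, not an official recipe: the dataset and model choice are arbitrary, and the `model.joblib` file name follows the convention the Prediction documentation describes for scikit-learn versions.

```python
# Minimal sketch: train a scikit-learn model and export it in the
# joblib format that AI Platform Prediction expects for scikit-learn
# model versions (a file named "model.joblib" in a Cloud Storage
# directory). Dataset and model choice here are purely illustrative.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X, y)

# Serialise the fitted estimator; this is the artifact you would
# upload to Cloud Storage when creating a model version.
joblib.dump(model, "model.joblib")

# Sanity check: reload the artifact and predict, much as the hosted
# service does when serving requests.
restored = joblib.load("model.joblib")
print(restored.predict(X[:1]))
```

From there, the artifact would typically be copied to a Cloud Storage bucket and registered as a version with the `gcloud ai-platform` tooling (passing `--framework scikit-learn`), though the exact flags are best checked against Google’s current documentation.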
The team has also used the beta phase to add a few more regional endpoints for better availability and regional isolation, and upped the game on security by introducing the option to define security perimeters. Models deployed inside a perimeter can only use resources and services within it, which is mostly useful in scenarios where sending traffic over the public internet isn’t an option.
Google has had a foot in the door of the ML community for a while, mainly thanks to its widely known TensorFlow project. The library, however, has serious competition in the form of Facebook’s PyTorch, which some developers prefer because it, for example, tends to be easier to debug.
Both projects use multi-dimensional arrays, so-called tensors, for their computations. To speed up those workloads, Google started offering cloud access to its custom-made, TensorFlow-tailored tensor processing units (TPUs for short) in 2018. Since then, the integrated circuits have gone through a few iterations, but the goal has always stayed the same.
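The tensor computations both frameworks rest on can be sketched in a framework-agnostic way. The example below uses NumPy rather than TensorFlow or PyTorch purely for illustration; a batched matrix multiplication is exactly the kind of operation TPUs are built to accelerate.

```python
# A tensor is simply an n-dimensional array. Here a 3-D tensor holding
# a batch of matrices is multiplied against a weight matrix, the core
# operation in most neural-network layers.
import numpy as np

rng = np.random.default_rng(0)
batch = rng.random((8, 4, 5))    # 3-D tensor: 8 matrices of shape 4x5
weights = rng.random((5, 3))     # 2-D tensor (a plain matrix)

# Batched matrix multiply: the same weights are applied independently
# to each of the 8 matrices, giving a tensor of shape (8, 4, 3).
out = batch @ weights
print(out.shape)  # (8, 4, 3)
```

In TensorFlow or PyTorch the code looks almost identical, but the frameworks can dispatch the same operation to a GPU or TPU instead of the CPU.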
This does not mean Google is indifferent to potential new user groups; in fact, the company has partnered with Facebook on a Python package called PyTorch/XLA, whose sole purpose is to get the library running on TPUs. To make it official, Google this week announced general availability of PyTorch/XLA support for Cloud TPUs, meaning an optimised set of deep learning model implementations is now available.