Google has added the option of Nvidia GPUs to its AI Platform as part of an overhaul of the machine-learning-as-a-service platform.
As Google explains, “ML models are so complex that they only run with acceptable latency on machines with many CPUs, or with accelerators like NVIDIA GPUs. This is especially true of models processing unstructured data like images, video, or text.”
That would be why the service’s single generally available option – one vCPU, 2GB of RAM, and no GPU support – seems a little bare bones, though it does support all types of model artifacts, with a maximum model size of 500MB. This basic tier is now joined by a beta option with four vCPUs.
However, there is a range of other options available in beta – if your preference is for TensorFlow SavedModels.
These are built around permutations of 2 to 32 vCPUs, up to 208GB of RAM and GPU support. The supported model size is up to 2GB. GPU options include Nvidia’s Tesla K80, P4, P100, T4 and V100, with up to 8 of each available, depending on your chosen machine type.
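In practice, the machine type and accelerator pairing is declared when a model version is deployed. A minimal sketch of what that request body might look like, using field names from the AI Platform versions API – the bucket path, version name, and runtime version here are placeholders, not real resources:

```python
# Hypothetical request body for creating an AI Platform model version
# on one of the new beta machine types with an attached GPU.
version_body = {
    "name": "v1",                                        # placeholder version name
    "deploymentUri": "gs://example-bucket/saved_model/",  # TensorFlow SavedModel (placeholder path)
    "runtimeVersion": "1.15",                             # placeholder runtime
    "framework": "TENSORFLOW",
    "machineType": "n1-standard-4",     # one of the new beta machine types
    "acceleratorConfig": {
        "count": 1,                     # up to 8, depending on machine type
        "type": "NVIDIA_TESLA_T4",      # K80, P4, P100, T4 or V100 variants
    },
}

print(version_body["machineType"], version_body["acceleratorConfig"]["type"])
```

The key point is that GPU count and type are constrained by the chosen machine type, so the two fields travel together in the version definition.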
The cornucopia of machine options comes as part of an overhaul of the platform’s backend, including the not terribly surprising news that it is now built on Google Kubernetes Engine. Google has also tied the platform to its other data products, with users now able to log their prediction requests and responses to the vendor’s BigQuery platform.
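The BigQuery hook is also configured per model version. A sketch of the logging stanza, assuming the request-logging field names from the AI Platform Prediction API – the project, dataset, and table names are invented for illustration:

```python
# Hypothetical sketch: routing a sample of online prediction requests and
# responses to a BigQuery table. The table reference is a placeholder.
logging_config = {
    "requestLoggingConfig": {
        "bigqueryTableName": "my_project.prediction_logs.requests",  # placeholder
        "samplingPercentage": 0.1,  # log roughly 10% of prediction traffic
    }
}

print(logging_config["requestLoggingConfig"]["bigqueryTableName"])
```

Sampling rather than logging every request keeps BigQuery ingestion costs proportionate for high-traffic models.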
All of this will, Google says, ensure that “Application developers can access AI without having to understand ML frameworks, and data scientists don’t have to manage the serving infrastructure.”
However, while that original machine type costs $0.0401 per node hour for online prediction, the new options range from $0.1349 up to $1.8928 per node hour, with GPUs costing extra. So somewhere along the line, someone will want to be managing the cost of all this newly available horsepower.
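To put those node-hour figures in perspective, here is a back-of-the-envelope monthly comparison for a single node running around the clock – the 730-hour month and single-node assumption are ours, and GPU surcharges are excluded:

```python
# Rough monthly cost from the quoted per-node-hour prices, GPUs excluded.
HOURS_PER_MONTH = 730  # assumed average month, 24 hours a day

def monthly_cost(per_node_hour: float, nodes: int = 1) -> float:
    """Node-hour price * hours in a month * node count, rounded to the cent."""
    return round(per_node_hour * HOURS_PER_MONTH * nodes, 2)

basic = monthly_cost(0.0401)   # original single-vCPU tier
low   = monthly_cost(0.1349)   # cheapest of the new machine types
high  = monthly_cost(1.8928)   # priciest of the new machine types

print(basic, low, high)  # roughly $29 vs $98 vs $1,382 per month
```

Even before GPUs, the spread between the basic tier and the top machine type is nearly fifty-fold, which is where the cost management comes in.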