ML librarians at PyTorch – Facebook’s AI crew – have pushed out version 1.8, which adds support for AMD ROCm, meaning PyTorch can now more easily run natively on AMD GPUs without having to configure Docker.
The support is provided through binaries available via pytorch.org, with onlookers saying the move is a sign of confidence “about the quality” of support for the open-source universal platform for GPU-accelerated computing. The feature is marked “beta” in the release; to use it, the developers said, users need to navigate to the standard PyTorch installation selector, choose ROCm as the installation option, and execute the provided command.
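For a flavour of what the selector hands back, the command is a plain pip install pointed at a ROCm wheel index – note that the exact index URL and ROCm version below are illustrative assumptions, not copied from the selector; the authoritative command comes from pytorch.org:

```shell
# Illustrative sketch only: the real index URL and ROCm version
# are whatever the pytorch.org installation selector shows.
pip install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html
```

The point is simply that installation is a one-liner rather than a Docker setup.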
The move could see ML wranglers paying more attention to the AMD GPU platform, potentially for small-scale machine learning training and GPU-based inference. In January last year, AMD contributed a backend to Microsoft’s deep learning runtime to support its chips – and accelerated both TensorFlow and PyTorch code.
Also under the umbrella of hardware support, the PyTorch team has provided the ability to extend the PyTorch Dispatcher for a new backend in C++.
The release packs in several new and updated APIs: modules for Fast Fourier Transforms (torch.fft) and linear algebra (torch.linalg), among others, have been added or stabilized.
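On the linear algebra side, the new module follows NumPy's np.linalg naming, so NumPy users can largely transliterate. A minimal sketch, assuming a PyTorch build that ships torch.linalg (1.8 or later):

```python
import torch

# torch.linalg mirrors NumPy's np.linalg naming conventions
A = torch.tensor([[3.0, 1.0],
                  [1.0, 2.0]])  # a small symmetric matrix

# Eigenvalues of a symmetric matrix, in ascending order
evals = torch.linalg.eigvalsh(A)

# Frobenius norm, as with np.linalg.norm
fro = torch.linalg.norm(A)

print(evals)  # the eigenvalues sum to the trace, 5.0
print(fro)
```

As with np.linalg, the same calls run unchanged on a GPU tensor.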
The torch.fft module itself was described by the PyTorch team as an investment in its “goal to support scientific computing”: it implements the same functions as NumPy’s np.fft module, only with support for hardware acceleration and autograd.
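A hedged sketch of what that looks like in practice – the familiar np.fft naming, with gradients flowing back through the transform (the signal here is a made-up example; assumes PyTorch 1.8+):

```python
import math
import torch

# A 64-sample sine wave concentrated at frequency bin 3
n = 64
t = torch.arange(n, dtype=torch.float32)
x = torch.sin(2 * math.pi * 3 * t / n)
x.requires_grad_(True)

# Same name and semantics as np.fft.fft, but autograd-aware
spectrum = torch.fft.fft(x)

# The magnitude peaks at bin 3 (mirrored at bin n - 3)
peak = spectrum.abs()[: n // 2].argmax()
print(int(peak))  # 3

# Gradients propagate back through the FFT to the input signal
spectrum.abs().sum().backward()
print(x.grad.shape)  # one gradient per input sample
```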
Support for Python-to-Python functional transformations via torch.fx was also squeezed into the release, though this is currently a beta feature.
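The gist is that torch.fx traces a module's Python forward() into a graph that can be inspected, transformed, and turned back into Python code. A minimal sketch, with a made-up toy module:

```python
import torch
import torch.fx


class AddRelu(torch.nn.Module):
    """Hypothetical toy module used only for illustration."""
    def forward(self, x):
        return torch.relu(x + 1.0)


# Symbolically trace the Python forward() into a transformable graph
traced = torch.fx.symbolic_trace(AddRelu())
print(traced.code)  # Python source regenerated from the graph

# The traced GraphModule behaves like the original module
inp = torch.tensor([-2.0, 0.5])
print(traced(inp))  # matches AddRelu()(inp)
```

Between tracing and code generation, the graph can be rewritten – which is the "Python to Python transformation" the release notes refer to.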
The 1.8 release is loaded with a new set of tutorials to help new PyTorch Mobile users quickly launch models on iOS or Android, as well as demo apps with examples of image segmentation, object detection, neural machine translation, question answering, and vision transformers.
Updates also landed in a number of PyTorch libraries: Torchvision now includes the team’s “first” on-device support and binaries for a PyTorch domain library, and tweaks were made to TorchCSPRNG, TorchText, and TorchAudio.
Finally, distributed training picked up improvements including NCCL reliability fixes, pipeline parallelism support, and RPC profiling, plus support for communication hooks that add gradient compression.