One for you, one for me: TensorFlow 2.4 lands with stable multi-worker strategy

After polishing release candidates for six weeks, the TensorFlow team topped off 2020 by lobbing version 2.4 of the deep learning framework into the hands of machine learning practitioners.

In TensorFlow 2.4, the development team graduated the MultiWorkerMirroredStrategy API into a stable feature, improving the handling of peer failure and fixing bugs along the way. Researchers can use the strategy to distribute model training across multiple workers, each with one or more GPUs, to speed up the process. And since profiling and tracing of multiple workers is part of the 2.4 package, they will also have a chance to find out what went wrong should things go pear-shaped.
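For those who have yet to try it, a minimal sketch of multi-worker training with the now-stable strategy might look like the following; the host addresses, task index, and model are placeholders rather than anything prescribed by the release.

```python
import json
import os

import tensorflow as tf

# Each worker describes the cluster via TF_CONFIG before creating the strategy
# (hostnames and the task index below are illustrative placeholders).
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["host1:12345", "host2:12345"]},
    "task": {"type": "worker", "index": 0},
})

strategy = tf.distribute.MultiWorkerMirroredStrategy()

with strategy.scope():
    # Variables created in this scope are mirrored across all workers and GPUs.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# model.fit(train_dataset, epochs=3)  # every worker runs the same script
```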

Those seeking to use more data during training can try their hand at parameter server training, an asynchronous method which has been implemented as a preview for version 2.4. 
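A rough sketch of the preview, assuming a cluster of workers and parameter servers described via TF_CONFIG with the current process acting as coordinator, could look like this; the model, dataset, and step count are placeholders.

```python
import tensorflow as tf

cluster_resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
strategy = tf.distribute.experimental.ParameterServerStrategy(cluster_resolver)
coordinator = tf.distribute.experimental.coordinator.ClusterCoordinator(strategy)

with strategy.scope():
    # Variables are placed on the parameter servers.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

def dataset_fn():
    # Executed on each worker to build its own input pipeline.
    x = tf.random.uniform((1024, 8))
    y = tf.random.uniform((1024, 1))
    return tf.data.Dataset.from_tensor_slices((x, y)).repeat().batch(32)

@tf.function
def train_step(iterator):
    def step_fn(batch):
        features, labels = batch
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean(tf.square(model(features) - labels))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss
    return strategy.run(step_fn, args=(next(iterator),))

per_worker_dataset = coordinator.create_per_worker_dataset(dataset_fn)
per_worker_iterator = iter(per_worker_dataset)
for _ in range(100):
    # Steps are dispatched asynchronously to whichever worker is free.
    coordinator.schedule(train_step, args=(per_worker_iterator,))
coordinator.join()  # block until all scheduled steps have finished
```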

Devs familiar with numerical computation library NumPy will no doubt be excited to learn that TensorFlow now ships with an experimental API that implements a subset of the library. The interface is meant to let NumPy-style code tap into TensorFlow's acceleration features, such as GPU execution, and interoperate with the rest of its APIs for more flexibility.
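In practice that looks roughly like the snippet below; the shapes and values are purely illustrative.

```python
import tensorflow as tf
import tensorflow.experimental.numpy as tnp

# NumPy-style arrays and functions, backed by TensorFlow tensors.
x = tnp.ones((3, 3), dtype=tnp.float32)
y = tnp.matmul(x, tnp.transpose(x)) + 1.0

# Results interoperate with ordinary TensorFlow APIs.
z = tf.reduce_sum(y)
print(z.numpy())
```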

As TensorFlow is best used in concert with accelerators, efforts have been made to let the framework play more nicely with Nvidia’s Ampere architecture. As a result, TF 2.4 now supports the TensorFloat-32 math mode on such hardware and enables it by default.
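Those who need full float32 precision for numerically sensitive workloads can check the mode or switch it off along these lines:

```python
import tensorflow as tf

# TensorFloat-32 is on by default on Ampere GPUs in TF 2.4.
print(tf.config.experimental.tensor_float_32_execution_enabled())

# Opt out of TF32 matmuls and convolutions if full float32 precision is needed.
tf.config.experimental.enable_tensor_float_32_execution(False)
```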

Delving deeper into the TensorFlow core component, users will discover a new experimental Union type which can be used as type annotation for convertible variables, a function to learn the total memory usage of a device, and a StatelessCase op. Other enhancements include tf.SparseTensor.with_values which returns a SparseTensor with the same sparsity pattern but newly provided values, support for non-boolean arguments in the Python bitwise operators for Tensor, and a fix to have tf.debugging.assert_shapes() work on SparseTensors.
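Taken together, and assuming the Union type in question is the tf.types.experimental.TensorLike alias described in the release notes, the additions can be exercised roughly as follows; the device string, tensor values, and the scale helper are illustrative only.

```python
import tensorflow as tf

# The experimental Union type annotates anything tf.convert_to_tensor accepts.
def scale(x: tf.types.experimental.TensorLike, factor: float) -> tf.Tensor:
    return tf.convert_to_tensor(x) * factor

# Total memory currently in use on a device (requires a visible GPU at "GPU:0").
# usage_bytes = tf.config.experimental.get_memory_usage("GPU:0")

# Keep a SparseTensor's sparsity pattern but swap in new values.
st = tf.SparseTensor(indices=[[0, 0], [1, 2]],
                     values=[1.0, 2.0], dense_shape=[3, 4])
st2 = st.with_values(tf.constant([10.0, 20.0]))

# assert_shapes now accepts SparseTensors as well.
tf.debugging.assert_shapes([(st2, ("rows", "cols"))])

print(scale([1.0, 2.0, 3.0], 2.0))
```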

In the run-up to the release, the Keras Functional API underwent a “major refactoring”, which is just as well, since tf.keras has more or less become the official home of the Keras project and is already a hive of developer activity.

The overhaul, however, means that users are advised to look at the list of breaking changes before updating their installation, to ensure that the modifications don’t render old code unusable. Code that relies on the exact names attached to symbolic tensors, or on the number and names of the op layers that TensorFlow operations were converted into, for example, might not work without some tweaking.

The Keras devs also stabilised the mixed precision API, which now officially supports the use of 16-bit floating point formats during training. As with most changes in this release, the move is meant to boost performance on GPUs and TPUs.
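Getting the benefit is largely a matter of setting a global policy before building the model, roughly as sketched below; the model itself is a placeholder.

```python
import tensorflow as tf

# Compute in float16 while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    # Keep the final activations in float32 for numerical stability.
    tf.keras.layers.Dense(10, dtype="float32"),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```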

The team behind the data module has been busy as well, fitting tf.data with mechanisms to register and consume datasets via the tf.data service; a way to share dataset graphs through a shared filesystem rather than over RPC; and a processing mode that splits datasets across workers instead of handing each the full set. Details can be found in the TensorFlow repository.
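As a rough illustration, and assuming a tf.data service dispatcher is already running at a placeholder address, the new entry points can be wired up like this:

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(1000).map(lambda x: x * 2)

# "distributed_epoch" splits the dataset across tf.data workers instead of
# giving every worker a full copy; "parallel_epochs" keeps the old behaviour.
dataset = dataset.apply(
    tf.data.experimental.service.distribute(
        processing_mode="distributed_epoch",
        service="grpc://dispatcher.example:5050"))  # placeholder address

# Alternatively, register a dataset once and consume it by id elsewhere:
# dataset_id = tf.data.experimental.service.register_dataset(
#     service="grpc://dispatcher.example:5050", dataset=dataset)
# consumed = tf.data.experimental.service.from_dataset_id(
#     processing_mode="distributed_epoch",
#     service="grpc://dispatcher.example:5050",
#     dataset_id=dataset_id,
#     element_spec=dataset.element_spec)
```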