TensorFlow 2.8 lends hand at text processing, continues to lure devs onto more powerful hardware

Machine learning practitioners with an interest in hardware acceleration will find a few new things to try in the just-released version 2.8 of the TensorFlow framework.

While the update doesn’t bring many new features, it advances some of the more recent performance-related additions, as the number of enhancements to the still-experimental TensorRT and embedding APIs shows. TensorRT is an SDK for “high performance deep learning inference” offered by GPU company Nvidia. In v2.8, TensorFlow’s integration with the project provides more insight into the inference converted by the TF-TRT component and lets users prevent the framework from saving TRT-specific engines, thus reducing resource usage.

The embedding APIs, part of the component that handles cooperation with Google’s tensor processing units (TPUs), meanwhile gained a new argument for specifying the shape of a feature’s output activation, and the behaviour of TPUEmbedding and serving_embedding_lookup has been harmonised.

As of the current release, TensorFlow’s module for building input pipelines, tf.data, parallelises the copying of batch elements. It also has an easier time with file input, as TensorSliceDataset can now identify and handle inputs that are files.

Developers will also find a few interesting changes outside the core modules. The deep learning API tf.keras, which was moved into its own repository in August 2021, for instance includes a new random value generator for Keras initialisers and all random number generation code, as well as an output_mode argument for the Discretization and Hashing layers. New modes for TextVectorization make it easier to lowercase inputs, remove punctuation, or split text on every Unicode character.
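To make the three text-processing behaviours concrete, here is a plain-Python sketch of what lowercasing, punctuation removal, and per-character splitting do to a string. It deliberately does not call TensorFlow: the function names are hypothetical, and Python’s str.lower and string.punctuation only approximate what the Keras layer does internally. In Keras itself, these behaviours are selected through the TextVectorization layer’s standardize and split arguments.

```python
import string
from typing import List


def lowercase(text: str) -> str:
    """Approximates the lowercasing mode (hypothetical helper, not a Keras function)."""
    return text.lower()


def strip_punctuation(text: str) -> str:
    """Approximates the punctuation-removal mode using ASCII punctuation only."""
    return text.translate(str.maketrans("", "", string.punctuation))


def split_characters(text: str) -> List[str]:
    """Approximates the character-splitting mode: one token per Unicode code point."""
    return list(text)


if __name__ == "__main__":
    sample = "Hello, World!"
    print(lowercase(sample))
    print(strip_punctuation(sample))
    print(split_characters(sample))
```

Keeping the steps separate mirrors how the layer treats standardisation and splitting as independent choices, so a pipeline can, for example, lowercase its input without stripping punctuation.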

Mobile-focused TFLite comes with GPU delegation support for serialisation in the Java API, along with support for some random value generators and the where and raw_ops.Bucketize operators. Developers who previously used Interpreter::SetNumThreads should switch to InterpreterBuilder::SetNumThreads, as the former has been deprecated with this release.

The TensorFlow team also advises developers to replace all boosted tree code with TensorFlow Decision Forests, as the boosted tree code introduced some security issues and will be removed before the 2.9 release. More details are available in the release notes.