TensorFlow Lite pulls throttle, adds speed as it puts OpenCL in the sidecar

The team behind the GPU inference engine of mobile deep learning framework TensorFlow Lite has finished experimenting with an OpenCL-based flavour for Android, which promises up to double the speed of its OpenGL counterpart.

The alternative backend has been part of the TensorFlow repository for about a year, so it has already seen a fair amount of testing. However, the engineers behind the addition waited until now to officially launch the engine, making it more visible to developers using TFLite on Android devices.

An inference engine is a component used to apply learned rules to extract new information, which is useful in environments where coming up with completely new rules isn’t feasible due to limited resources – as is often the case on mobile or embedded systems.
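To make the distinction concrete, here is a toy sketch of what "applying learned rules" means: the parameters below are hypothetical stand-ins for weights learned offline and shipped with an app, and inference simply evaluates them on new input without any on-device training.

```python
# Toy illustration of inference: apply already-learned parameters to new
# inputs. No training happens on the device, only evaluation.
weights = [0.8, -0.2]  # hypothetical values, learned offline
bias = 0.1

def infer(features):
    """Evaluate a tiny linear model on one input vector."""
    return sum(w * x for w, x in zip(weights, features)) + bias

print(infer([1.0, 2.0]))  # 0.8 - 0.4 + 0.1, i.e. approximately 0.5
```

A real mobile model does the same thing at a much larger scale, which is why offloading that evaluation to the GPU pays off.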

The idea to give OpenCL a go was motivated by the fact that OpenCL was designed with the use of various accelerators in mind. The Open Graphics Library (OpenGL), which TensorFlow Lite normally uses when GPUs are roped into the inference process, only gained general compute capabilities comparatively late. As a consequence, its API carries the burden of staying backward compatible, which the team felt sometimes keeps it from getting the most out of a device's GPU.

Besides the easy accelerator use, OpenCL offers good profiling options which help to uncover potential for optimisation, supports 16-bit floating-point precision, and comes with constant memory, which has proven efficient in certain layers of a neural network. Making use of these features in an OpenCL backend saw the inference engine run twice as fast as the usual OpenGL solution, especially on the Adreno GPU series developed by Qualcomm.
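The bandwidth argument behind 16-bit floats is easy to demonstrate: a half-precision value occupies two bytes instead of four, at the cost of some precision. The sketch below uses Python's `struct` half-float format (`"e"`) purely as an illustration; it is not TFLite code.

```python
import struct

# FP16 ("e") uses half the bytes of FP32 ("f"): 2 vs 4. Halving tensor
# size halves memory traffic, a key win on bandwidth-bound mobile GPUs.
assert struct.calcsize("e") == 2
assert struct.calcsize("f") == 4

# The trade-off: FP16 keeps only ~3 decimal digits of precision, so a
# value picks up a small rounding error on the round trip.
value = 3.14159
fp16_roundtrip = struct.unpack("e", struct.pack("e", value))[0]
print(value, "->", fp16_roundtrip)
```

For many neural-network layers that rounding error is tolerable, which is why the OpenCL backend can exploit FP16 for speed.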

While this all sounds promising, the new backend has a major drawback: OpenCL isn't part of the standard Android distribution and might therefore not be available on every device. To work around that, the TFLite GPU delegate was fitted with a checking mechanism which employs OpenCL when it is found and falls back to OpenGL when it is not.
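The fallback pattern can be sketched as follows. This is a conceptual illustration only, not the actual TFLite delegate API; the function name is made up, and on a real Android device the probe amounts to checking whether an OpenCL runtime library (typically libOpenCL.so) can be loaded.

```python
import ctypes.util

def pick_gpu_backend(opencl_available: bool) -> str:
    """Mimic the delegate's choice: use OpenCL when present, else OpenGL."""
    return "OpenCL" if opencl_available else "OpenGL"

# Rough stand-in for the runtime probe: look for an OpenCL library on
# this machine. On Android the delegate performs an equivalent check.
has_opencl = ctypes.util.find_library("OpenCL") is not None
print("Selected backend:", pick_gpu_backend(has_opencl))
```

Because the check happens at runtime, app developers get the OpenCL speed-up where it exists without shipping separate builds for devices that only offer OpenGL.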