Researchers at Google’s AI division have come up with a reworked approach to audio classification, aiming for improved performance and better adaptability across a range of application domains.
In a paper to be presented at this year’s ninth International Conference on Learning Representations (ICLR) in May, the researchers introduce their Learnable Audio Frontend, LEAF, which replaces the various fixed operations of traditional pipelines with learned ones.
The work largely circles around the fact that, unlike other fields of machine learning that can work on raw data, “deep neural networks for audio classification are rarely trained from raw audio waveforms”. This is mainly because ML systems are often designed to mimic the human signal processing apparatus. In audio-related scenarios, data therefore has to be preprocessed to match the way humans perceive frequencies, giving more weight to the lower end of the spectrum.
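The standard way to encode that perceptual bias is the mel scale, which maps frequency in Hz onto a scale where equal steps sound equally spaced to a listener. A quick illustration using the common O’Shaughnessy formula (one of several conventions in use):

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale (O'Shaughnessy formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

# The mel scale compresses high frequencies: equal mel steps cover
# ever-wider Hz ranges as frequency rises, so low frequencies get
# proportionally more resolution.
print(hz_to_mel(100.0))    # low frequencies map almost linearly
print(hz_to_mel(1000.0))   # roughly 1000 mel by construction
print(hz_to_mel(8000.0))   # far less than 8x the value at 1 kHz
```

An octave jump from 4 kHz to 8 kHz covers far fewer mel than the same jump at the bottom of the range, which is exactly the kind of built-in assumption that may not suit non-human sounds.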
This, however, might not always be necessary or even helpful, the researchers point out, using the example of recognising whale calls. They therefore propose a more tailored approach. LEAF is guided by the steps used when creating the traditionally used mel filterbanks: windowing a signal to capture a sound’s time variability, filtering, and compressing. Instead of fixed layers designed to approximate how a human would perceive a signal, LEAF learns the operation best suited to the use case at hand (think a learned frequency scale for pitch instead of the fixed mel scale).
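The three fixed steps LEAF sets out to replace can be sketched in plain NumPy — a simplified illustration of a classic log-mel pipeline, not the exact implementation used in the paper:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, centre):
            fbank[i, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):
            fbank[i, k] = (right - k) / max(right - centre, 1)
    return fbank

def log_mel_spectrogram(signal, sample_rate=16000, n_fft=512, hop=160, n_filters=40):
    # 1. Windowing: slice the waveform into overlapping Hann-windowed frames.
    window = np.hanning(n_fft)
    frames = [signal[s:s + n_fft] * window
              for s in range(0, len(signal) - n_fft + 1, hop)]
    # 2. Filtering: FFT magnitudes projected onto the fixed mel filters.
    mag = np.abs(np.fft.rfft(np.array(frames), n=n_fft))
    mel = mag @ mel_filterbank(n_filters, n_fft, sample_rate).T
    # 3. Compression: log squashes the dynamic range, mimicking loudness perception.
    return np.log(mel + 1e-6)

# One second of a 440 Hz tone as a smoke test.
t = np.arange(16000) / 16000.0
feats = log_mel_spectrogram(np.sin(2 * np.pi * 440.0 * t))
print(feats.shape)   # (frames, 40)
```

LEAF’s contribution is to make each of these three commented stages learnable rather than hand-fixed, so the frequency scale, pooling, and compression are fitted to the task during training.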
Since systems based on mel filterbanks are also said to be less than ideal when working with noisy data, switching to LEAF could help if that’s all there is available. This weakness alone has already spurred plenty of research into learnable alternatives, though most options currently available seem to fall short on performance.
This is mostly down to the fact that a trainable frontend adds parameters that all have to be optimised to get good results, and the more of them there are, the harder that becomes. LEAF works around this by using Gabor convolution layers, which need just two parameters per filter.
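To see why this keeps the parameter count so low, here is a minimal sketch of a Gabor filter — a sinusoid under a Gaussian envelope. The function and parameter names are ours for illustration, not taken from the paper:

```python
import numpy as np

def gabor_filter(centre_freq, bandwidth, length=401, sample_rate=16000.0):
    """Complex Gabor filter: a sinusoid modulated by a Gaussian envelope.

    Only two parameters define each filter: the centre frequency of the
    sinusoid (Hz) and the width of the Gaussian window (seconds) -- these
    are the two values a LEAF-style frontend would learn per filter.
    """
    t = (np.arange(length) - length // 2) / sample_rate
    envelope = np.exp(-0.5 * (t / bandwidth) ** 2)   # Gaussian window
    carrier = np.exp(2j * np.pi * centre_freq * t)   # complex sinusoid
    return envelope * carrier

# A bank of N such filters needs just 2 * N trainable values,
# versus length * N for an unconstrained convolution layer.
bank = np.stack([gabor_filter(f, 0.001) for f in (250.0, 1000.0, 4000.0)])
print(bank.shape)   # (3, 401)
```

With 40 filters that is 80 trainable values instead of tens of thousands for a free-form convolution of the same length, which is what makes the frontend cheap to optimise.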
Google’s first test results show LEAF outperforming mel filterbanks in average accuracy across a variety of tasks. Devs interested in the approach can find a TensorFlow implementation on GitHub to verify the team’s findings. To make this easier, LEAF is designed as a drop-in replacement for mel filterbanks, since “any model that can be trained using mel filterbanks as input features, can also be trained on LEAF spectrograms.”
Going forward, the team plans to get rid of the convolutional architecture, with its fixed filter lengths and strides, in favour of a system that can learn these elements as well, removing further bias. It also expects its general principle of learning to filter, pool and compress to benefit the analysis of seismic data and physiological recordings, so forays into the realm of non-audio signals could be next for at least some of the researchers.