Google opens up FuzzBench so fuzz fans can compare and contrast their techniques

Google has taken another step in its effort to open up the joy of fuzzing to all, with the launch of fuzzer benchmarking as a service.

Fuzzing is an automated software testing technique that aims to find bugs by feeding a program malformed or semi-malformed data and seeing what happens, and Google said it has found “tens of thousands of bugs with fuzzers like libFuzzer and AFL”.
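To make that concrete, here is a minimal sketch of the idea in Python: a toy parser with a latent bug, a mutator that randomly flips, inserts, or deletes bytes, and a loop that hammers the parser with mutated inputs and records any that trigger an unexpected crash. The target, mutation strategy, and exception handling are all invented for illustration; real fuzzers like AFL and libFuzzer are far more sophisticated (coverage-guided corpus evolution, dictionaries, and so on).

```python
import random

def parse_header(data: bytes) -> str:
    """Toy target: expects a 4-byte magic, a length byte, then a payload."""
    if data[:4] != b"FUZZ":
        raise ValueError("bad magic")          # a handled, expected error
    length = data[4]
    payload = data[5:5 + length]
    if len(payload) != length:
        raise IndexError("truncated payload")  # the kind of bug fuzzing surfaces
    return payload.decode("ascii", errors="replace")

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level mutation to the seed input."""
    data = bytearray(seed)
    pos = rng.randrange(len(data))
    choice = rng.random()
    if choice < 0.5:
        data[pos] ^= 1 << rng.randrange(8)      # flip one bit
    elif choice < 0.75:
        data.insert(pos, rng.randrange(256))    # insert a random byte
    else:
        del data[pos]                           # delete a byte
    return bytes(data)

def fuzz(target, seed: bytes, iterations: int = 10_000) -> list:
    """Repeatedly feed mutated inputs to the target; collect crashing inputs."""
    rng = random.Random(0)
    crashers = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass                          # expected, gracefully handled error
        except Exception:
            crashers.append(candidate)    # unexpected exception: a bug found
    return crashers
```

Running `fuzz(parse_header, b"FUZZ\x03abc")` quickly turns up inputs whose length byte no longer matches the payload, tripping the unhandled `IndexError`.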

But, as the ads-to-cloud giant says in this week’s announcement, “it is hard to know how well these new tools and techniques generalize on a large set of real world programs.”

Not least because full-scale experiments can be hideously expensive, with Google saying “a 24-hour, 10-trial, 10 fuzzer, 20 benchmark experiment would require 2,000 CPUs to complete in a day”.

Google’s answer is its newly polished and released FuzzBench, which it describes as “a fully automated, open source, free service for evaluating fuzzers”, courtesy of its OSS-Fuzz team.

Apparently the service means “researchers can simply integrate a fuzzer and FuzzBench will run an experiment for 24 hours with many trials and real world benchmarks.”

According to the project’s GitHub page, FuzzBench consists of an “easy” API, along with benchmarks from real-world projects, and a reporting library to produce graphics and statistical tests, “to help you understand the significance of tests.”
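The project’s documentation describes that integration as a small per-fuzzer Python module with a build step and a fuzz step. The sketch below shows the general shape only; the `/myfuzzer/...` paths, compiler settings, and command-line flags are invented for illustration, and the exact hook signatures should be taken from FuzzBench’s own docs.

```python
# Sketch of a hypothetical fuzzers/myfuzzer/fuzzer.py integration module.
import os
import subprocess

def build():
    """Compile the benchmark with this fuzzer's instrumentation.
    FuzzBench runs this step inside the fuzzer's Docker image."""
    os.environ["CC"] = "clang"                        # assumed wrapper compiler
    os.environ["CXX"] = "clang++"
    os.environ["FUZZER_LIB"] = "/myfuzzer/driver.a"   # hypothetical driver lib
    subprocess.check_call(["make", "all"])            # build the target binary

def fuzz(input_corpus, output_corpus, target_binary):
    """Run the fuzzer for the duration of the trial. FuzzBench measures
    coverage from the test cases the fuzzer leaves in output_corpus."""
    os.makedirs(output_corpus, exist_ok=True)
    subprocess.call([
        "/myfuzzer/bin/myfuzzer",   # hypothetical fuzzer binary
        "-i", input_corpus,         # seed inputs supplied by the benchmark
        "-o", output_corpus,        # where generated test cases are written
        "--", target_binary,
    ])
```

With a module like this in place, FuzzBench handles the rest: scheduling trials, snapshotting corpora, and generating the comparison report.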

“Based on data from this experiment, FuzzBench will produce a report comparing the performance of the fuzzer to others and give insights into the strengths and weaknesses of each fuzzer.”

The upshot is that researchers can focus on sharpening up their fuzzing techniques, rather than on “setting up evaluations and dealing with existing fuzzers”.

The reports include statistical tests to give an idea of how likely it is that performance differences between fuzzers are simply due to chance, and researchers can also run their own analysis on the raw data. “Performance is determined by the amount of covered program edges, though we plan on adding crashes as a performance metric,” the vendor said.
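To illustrate why such tests matter, here is a toy example: two fuzzers run for ten trials each, and we ask whether the gap in median edge coverage could plausibly be chance. FuzzBench’s own choice of test isn’t detailed in the article; as a simple stand-in, this sketch uses a permutation test, and the per-trial coverage numbers are made up.

```python
import random
from statistics import median

def permutation_test(a, b, rounds=10_000, seed=0):
    """Two-sided permutation test on the difference of median coverage:
    how often does randomly relabelling the pooled trials produce a gap
    at least as large as the one actually observed?"""
    rng = random.Random(seed)
    observed = abs(median(a) - median(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        if abs(median(pooled[:len(a)]) - median(pooled[len(a):])) >= observed:
            hits += 1
    return hits / rounds

# Hypothetical per-trial covered-edge counts for two fuzzers (10 trials each)
fuzzer_a = [2100, 2150, 2120, 2200, 2180, 2090, 2160, 2140, 2130, 2170]
fuzzer_b = [1980, 2010, 1990, 2050, 2000, 1970, 2020, 2040, 1995, 2005]

p = permutation_test(fuzzer_a, fuzzer_b)
# a small p-value suggests the coverage gap is unlikely to be chance alone
```

With trial counts this small, single runs can mislead; aggregating over many trials and reporting a significance level is exactly the kind of legwork FuzzBench is meant to automate.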

Google open-sourced its ClusterFuzz project a year ago, designed to offer fuzzing at scale – Google claimed to have it running on over 25,000 cores for Chrome.