Google hands developers tool for preserving individuals’ privacy when analysing (lots of) data

Google has open-sourced a tool that it says lets it analyse vast amounts of data without compromising individuals’ privacy.

According to a blog post by Miguel Guevara, of Google’s Privacy and Data Protection Office, the Differential Privacy project is an “open-source version of the differential privacy library that helps power some of Google’s core products.”

As Guevara explains, “Differentially-private data analysis is a principled approach that enables organizations to learn from the majority of their data while simultaneously ensuring that those results do not allow any individual’s data to be distinguished or re-identified.”

The project’s GitHub page describes it as a “C++ library of ε-differentially private algorithms, which can be used to produce aggregate statistics over numeric data sets containing private or sensitive information.”

It includes algorithms for count, sum, mean, variance, standard deviation and order statistics, and a stochastic tester to “check the correctness of the algorithms”.
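To give a flavour of what such algorithms do, here is a minimal Python sketch of a noisy count and a noisy bounded sum using the Laplace mechanism, a standard way of achieving ε-differential privacy. It is not Google's C++ code or API: the function names `laplace_noise`, `private_count` and `private_bounded_sum` are hypothetical, and the sketch ignores floating-point hardening and privacy-budget tracking, both of which a production library has to handle.

```python
import random


def laplace_noise(scale: float) -> float:
    """Draw one sample from a Laplace distribution centred on 0."""
    # The difference of two independent exponential samples with mean
    # `scale` follows a Laplace(0, scale) distribution.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def private_count(values, epsilon: float) -> float:
    """Return a noisy count of `values` (hypothetical helper)."""
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    return len(values) + laplace_noise(1.0 / epsilon)


def private_bounded_sum(values, lower: float, upper: float, epsilon: float) -> float:
    """Return a noisy sum after clamping each value to [lower, upper]."""
    # Clamping limits how much any single record can move the sum.
    clamped = [min(max(v, lower), upper) for v in values]
    sensitivity = max(abs(lower), abs(upper))
    return sum(clamped) + laplace_noise(sensitivity / epsilon)


if __name__ == "__main__":
    ages = [23, 35, 41, 29, 62, 57]
    print("noisy count:", private_count(ages, epsilon=1.0))
    print("noisy sum:  ", private_bounded_sum(ages, 0, 100, epsilon=1.0))
```

A smaller ε means more noise and stronger privacy, while a larger ε gives more accurate but less private results.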

According to the abstract of a paper accompanying the announcement, “Differential privacy (DP) provides formal guarantees that the output of a database query does not reveal too much information about any individual present in the database.”
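For readers who want the formal version, the standard textbook statement of that guarantee (not quoted from the paper) is that a randomised mechanism M is ε-differentially private if, for any two databases D and D′ that differ only in one individual's data, and for every set S of possible outputs:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S]
```

In other words, adding or removing any one person's data can change the probability of any given result by a factor of at most e^ε.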

The abstract goes on, “We propose a generic and scalable method to perform differentially private aggregations on databases, even when individuals can each be associated with arbitrarily many rows. We express this method as an operator in relational algebra, and implement it in an SQL engine.”
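The abstract does not spell out the operator itself, but the general idea of limiting how much any one individual can contribute before adding noise can be sketched in a few lines of Python. This is a simplified illustration and not the paper's method: the function `private_sum_per_user_bounded`, the `(user_id, value)` row shape and the `max_per_user` parameter are all assumptions made for the example.

```python
import random
from collections import defaultdict


def laplace_noise(scale: float) -> float:
    """Draw one sample from a Laplace distribution centred on 0."""
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def private_sum_per_user_bounded(rows, max_per_user: float, epsilon: float) -> float:
    """Noisy sum over (user_id, value) rows with a per-user contribution cap.

    Hypothetical helper: an individual may own any number of rows, so their
    rows are first collapsed into one total, which is then clamped.
    """
    totals = defaultdict(float)
    for user_id, value in rows:
        totals[user_id] += value

    # Clamp each individual's total so no one can shift the result by
    # more than max_per_user, however many rows they have.
    bounded = [min(max(t, -max_per_user), max_per_user) for t in totals.values()]

    # The sensitivity is now max_per_user per individual, not per row.
    return sum(bounded) + laplace_noise(max_per_user / epsilon)


if __name__ == "__main__":
    purchases = [("alice", 12.0), ("alice", 30.0), ("bob", 8.0), ("carol", 250.0)]
    print(private_sum_per_user_bounded(purchases, max_per_user=50.0, epsilon=1.0))
```

Without the per-user cap, the noise would have to be calibrated to the largest amount any individual could contribute across all their rows, which is unbounded when they can have arbitrarily many.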

Back in July, Google open-sourced Private Join and Compute, another privacy technology designed to allow organisations to gain aggregated insights from confidential datasets without compromising individuals’ data.

The moves give developers and data scientists an increasingly broad tool set for deriving insights from data while doing what they can to protect individuals’ privacy. Whether such moves will reassure the broader population that their data is safe in the hands of Google, and of companies using its technology, is quite another matter.