Differential Privacy

In brief
Differential privacy guarantees that adding or deleting a single individual's record does not significantly affect the result of any analysis.
More in detail

The Family of Differential Privacy Models
Differential privacy is a prominent family of privacy-preserving data publishing models (see Privacy Models). It defines privacy as the ability to bound the impact of any single individual on the outputs of the function that produces the information to publish (the computation, e.g., of a set of statistics, of a machine learning model, or of generated synthetic data). In other words, a differentially private function promises each individual that its outputs will be roughly the same whether or not the individual's data is part of its input. All differential privacy models share this intuitive goal, but they differ in how they formalize it, for example in how they quantify the impact of an individual or in their tolerance to (improbable) failures of the guarantee. They usually exhibit the properties that have been identified as key requirements for privacy models.
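For concreteness, the seminal \(\epsilon\)-differential privacy model (the standard textbook definition, recalled here for illustration) formalizes this promise as follows: a randomized function \(\mathcal{M}\) satisfies \(\epsilon\)-differential privacy if, for every pair of datasets \(D\) and \(D'\) differing in a single individual's record and for every set \(S\) of possible outputs,
\[
\Pr[\mathcal{M}(D) \in S] \leq e^{\epsilon} \cdot \Pr[\mathcal{M}(D') \in S].
\]
The smaller \(\epsilon\) is, the closer the two output distributions are, and thus the smaller the impact of any single individual.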
Achieving Differential Privacy
Designing a function that satisfies differential privacy often boils down to carefully combining basic perturbation mechanisms (e.g., the Laplace mechanism) and to demonstrating formally either that the data is only accessed through differentially private functions (leveraging the safety-under-post-processing and self-composability properties) or that the output distribution of the function complies with the targeted differential privacy model (through, e.g., randomness alignments). A minimal sketch of the Laplace mechanism is given below. We refer the interested reader to Achieving Differential Privacy for more information.
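As an illustrative sketch (not part of the original entry), the following Python snippet shows the Laplace mechanism applied to a counting query. The function name `laplace_mechanism` and the toy dataset are ours; only `numpy` is assumed.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Differentially private estimate of `true_value`.

    Adds Laplace noise with scale sensitivity / epsilon, the classic
    calibration that makes the release epsilon-differentially private
    when `sensitivity` bounds how much one individual's record can
    change `true_value`.
    """
    return true_value + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# A counting query has sensitivity 1: adding or deleting one record
# changes the count by at most 1.
ages = [23, 35, 45, 61, 29, 54]  # toy dataset
true_count = sum(1 for a in ages if a >= 40)
private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
print(f"true count: {true_count}, private count: {private_count:.2f}")
```

The two properties mentioned above then do the bookkeeping: by self-composability, answering \(k\) such queries with budgets \(\epsilon_1, \dots, \epsilon_k\) is \((\epsilon_1 + \dots + \epsilon_k)\)-differentially private overall, and by safety under post-processing, any further computation on `private_count` alone cannot weaken the guarantee.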
An expanding universe
The seminal differential privacy models were proposed in the mid-2000s and include \(\epsilon\)-differential privacy (see \(\epsilon\)-Differential Privacy) and \((\epsilon, \delta)\)-differential privacy (see (\(\epsilon\),\(\delta\))-Differential Privacy). The number of differential privacy models has grown rapidly over the years (more than 200 extensions or variants were reported in a 2020 survey paper). Differential privacy is often considered in academia as a de facto standard for privacy-preserving data publishing and earned its original authors the prestigious Gödel Prize in 2017. Well-known organizations (e.g., the US Census Bureau) and companies (e.g., Google, Apple, LinkedIn, Microsoft) have launched ambitious real-life applications of differential privacy.
Bibliography
The seminal differential privacy models were introduced in [1], [2], and [3]. Differential privacy is thoroughly introduced in [4], and numerous variants and extensions are surveyed in [5]. The book [6] surveys differential privacy techniques related to database queries.
- [1] Cynthia Dwork. Differential privacy. In ICALP, 2006.
- [2] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, 2006.
- [3] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: privacy via distributed noise generation. In EUROCRYPT, 2006.
- [4] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9:211–407, 2014.
- [5] Damien Desfontaines and Balázs Pejó. SoK: differential privacies. Proceedings on Privacy Enhancing Technologies, 2020:288–313, 2020.
- [6] Joseph P. Near and Xi He. Differential privacy for databases. Foundations and Trends in Databases, 11:109–225, 2021.
This entry was written by Tristan Allard.