Data Anonymization (and Pseudonymization)#

In Brief#

A data subject is considered anonymous if it is reasonably hard to attribute his personal data to him/her.

More in Detail#

A data subject is considered anonymous if it is reasonably hard to attribute his personal data to him/her. What “reasonably” actually means depends both on the context and on the requirements given by data respondents. Both the identity of a subject and other information related to him/her are considered in the context of anonymity, for example sensitive information regarding health, religion, political tendencies and, more in general, any kind of information that can somehow distinguish the subject from others. The European General Data Protection Regulation (GDPR [3]) defines anonymous data as

“[…] information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable”.

Thus, data to have this property has to be deprived of all distinctive elements of a person, i.e., those elements that permit to identify both directly or indirectly that person in the data. Since anonymous data does not enable re-identification of data subjects, even with the use of additional information, this type of data is not subject to the privacy regulations since it is not considered personal data.

Note that this is process is absolutely different from removing the direct identifiers only (e.g., name, surname, social security number). This process is called de-identification, and it can be subjected to privacy leaks, such as re-identification. The process of substitute a set of direct identifiers with a surrogate value, or psudonym, is called Pseudonymization. It suffers from the same weaknesses of de-identification. Nevertheless, the GDPR strongly advocates the application of Pseudonymization:

“The application of pseudonymisation to personal data can reduce the risks to the data subjects concerned and help controllers and processors to meet their data-protection obligations” (Recital 28).

“In order to be able to demonstrate compliance with this Regulation, the controller should adopt internal policies and implement measures […]. Such measures could consist, inter alia, of […] pseudonymising personal data as soon as possible […]” (Recital 78).

“The further processing of personal data for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes is to be carried out […] provided that appropriate safeguards exist (such as, for instance, pseudonymisation of the data)” (Recital 156).

Bibliography#

1: European Parliament & Council. General data protection regulation. 2016. L119, 4/5/2016, p. 1–88.

This entry was readapted from Comandé et al. Elgar Encyclopedia of Law and Data Science. Edward Elgar Publishing (2022) ISBN: 978 1 83910 458 9 by Francesca Pratesi, Roberto Pellungrini, and Anna Monreale.

The TAILOR Handbook of Trustworthy AI

Data Anonymization (and Pseudonymization)

Contents

Data Anonymization (and Pseudonymization)#

In Brief#

More in Detail#

Bibliography#