Post-graduate law degree, CIPP/E from the International Association of Privacy Professionals (IAPP). Privacy and Data Protection Research Writer at TermsFeed.
On this page
- 1. What are Anonymization and Pseudonymization?
- 1.1. What is Anonymization?
- 1.2. What is Pseudonymization?
- 2. Benefits of an Anonymization and Pseudonymization Policy
- 2.1. It Ensures Proper Anonymization
- 2.2. It Ensures Proper Pseudonymization
- 2.3. Anonymization Reduces Your Compliance Burden
- 2.4. Pseudonymization Can Reduce Legal Liability
- 3. Outline of Your Anonymization and Pseudonymization Policy
- 3.1. Scope of the Policy
- 3.2. Purpose
- 3.3. Definitions
- 3.4. Roles and Responsibilities
- 3.5. Your Process
- 3.5.1. Process of Anonymization
- 3.5.2. Process of Pseudonymization
- 3.6. Applicable Laws and Regulations
- 4. The GDPR and Advantages of an Anonymization and Pseudonymization Policy
- 4.1. Which Data Masking Technique is Better?
- 5. Summary
Anonymization and pseudonymization are two important ways of protecting personal information within your organization.
These concepts have been central to EU data protection law for many years, but with strict privacy laws emerging in the U.S. and elsewhere, organizations all over the world should now be considering how best to integrate these practices into their operations.
In this article, we'll explain the differences between these two methods of data masking, show how an Anonymization and Pseudonymization Policy can help protect personal information within your organization, and outline the contents of this type of policy so you can create your own.
What are Anonymization and Pseudonymization?
Anonymization and pseudonymization are two ways of processing personal information.
Anonymization and pseudonymization are not the same, as explained by the Article 29 Working Party, a now-retired working group that provided guidance on EU data protection law (at page 3 of the linked PDF):
Under EU law, anonymized data is not personal information, but pseudonymized data remains personal information.
What is Anonymization?
Accordingly, anonymization means removing any references to an identifiable person from a data set, thus turning personal information into non-personal information.
Given the changing nature of technology, it is possible that some anonymized data sets might, one day, be subject to re-identification. However, this should not be reasonably possible in the current technological climate.
Anonymization is commonly used to depersonalize personal information before processing it for statistical purposes.
What is Pseudonymization?
Pseudonymization is defined at Article 4 (5) of the GDPR:
A virtually identical definition appears at Section 1798.140 (r) of the California Consumer Privacy Act (CCPA):
There are several conditions inherent to this definition:
- After personal information has been pseudonymized, no individual can be identified from the personal information without reference to additional information
- The additional information must be kept separately from the pseudonymized personal information
- The additional information must be subject to technical and organizational safeguards (such as access controls) to keep it secure
If any one of these conditions is not present, then pseudonymization has not taken place.
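The three conditions above can be illustrated with a minimal Python sketch of tokenization. It is only one possible approach, not a method prescribed by the GDPR or the CCPA. The function name `pseudonymize`, the field name `email`, and the sample records are hypothetical. The returned `lookup` table plays the role of the "additional information": the sketch assumes you would store it separately from the masked records and restrict access to it.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace id_field in each record with a random token.

    Returns (masked_records, lookup). The lookup maps each token back to
    the original identifier and is the "additional information" that is
    assumed to be stored separately, under access controls.
    """
    lookup = {}   # token -> original identifier
    reverse = {}  # original identifier -> token, so repeats share a token
    masked_records = []
    for record in records:
        original = record[id_field]
        if original not in reverse:
            token = secrets.token_hex(8)  # random, not derived from the value
            reverse[original] = token
            lookup[token] = original
        masked = dict(record)
        masked[id_field] = reverse[original]
        masked_records.append(masked)
    return masked_records, lookup


# Hypothetical sample data
customers = [
    {"email": "ana@example.com", "plan": "pro"},
    {"email": "ben@example.com", "plan": "free"},
    {"email": "ana@example.com", "plan": "pro"},
]
masked, lookup = pseudonymize(customers, "email")
```

Because the tokens are random rather than derived from the email address, the masked records alone do not reveal who they relate to. Anyone holding both the masked records and `lookup` can re-identify individuals, which is why the two should not be kept together.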
Benefits of an Anonymization and Pseudonymization Policy
Maintaining an Anonymization and Pseudonymization Policy will help ensure you can demonstrate your compliance with the relevant laws and regulations, and it could reduce your liability under such laws.
Having a robust set of policies in place helps ensure everyone in your organization is on the same page when it comes to privacy and data protection, and an Anonymization and Pseudonymization Policy fits in well alongside them.
It Ensures Proper Anonymization
Your Anonymization and Pseudonymization Policy will ensure that your employees understand the high standards that apply when anonymizing personal information.
The concept of "anonymization" is often misunderstood and misapplied.
A relatively common misapplication of "anonymization" involves the use of individuals' initials in place of their full name. This method is unlikely to even meet the threshold for "pseudonymized" personal information, let alone "anonymized."
In the context of EU law, anonymization is an effectively irreversible process that prevents the identification of individuals (using current technology). This is a high threshold.
It Ensures Proper Pseudonymization
Your Anonymization and Pseudonymization Policy will ensure that your employees understand that the re-identification of individuals from pseudonymized personal information must be very unlikely without reference to additional information.
The Policy will also make clear that any additional information used to re-identify individuals must be stored separately from the personal information, and appropriate security measures and access controls must be applied to it.
As with anonymization, the threshold for pseudonymization is relatively high.
To return to the example above, the use of an individual's initials in place of their full name would probably not meet the threshold for pseudonymization.
Your employees must also understand that pseudonymized data remains "personal information," and, therefore, it must be presented to individuals in an intelligible form if they request access to it.
Anonymization Reduces Your Compliance Burden
Another key benefit of creating an Anonymization and Pseudonymization Policy is that it will help ensure employees are anonymizing personal information wherever possible and appropriate.
The benefits of anonymization are clear. At Recital 26, the GDPR states that data protection law does not apply to anonymous data:
This is confirmed by the Article 29 Working Party (at page 5, here):
The CCPA does not explicitly refer to anonymization, but given that the law uses a very similar definition of "personal information," it is reasonable to assume that anonymized data also falls outside of the scope of the CCPA.
Because privacy law no longer applies to anonymized data, this means:
- There is no need to store it securely (unless it is sensitive or valuable for other reasons)
- Individuals can no longer exercise their data rights over it (therefore you will not need to provide access to it, delete it, or rectify it on request)
- If it is compromised in a data breach, you will not have to notify the authorities or the individuals affected
Note, however, that other laws and regulations may still apply to anonymized data.
Pseudonymization Can Reduce Legal Liability
Your Anonymization and Pseudonymization Policy will help ensure that your employees pseudonymize personal information whenever required.
The GDPR presents pseudonymization as a valid means of mitigating risks to privacy and fulfilling certain obligations to protect personal information. For example, at Recital 28:
At Article 25, pseudonymization is presented as a means by which to implement the principle of "data protection by design and by default":
Encrypting personal information can also allow your company to escape liability for data breaches under some privacy laws (encryption being, under certain conditions, a form of pseudonymization).
For example, the New York SHIELD Act applies to businesses holding the "private information" of New York residents. Private information is defined as certain types of personal information that have not been encrypted. Therefore, it is possible that encrypting personal information could allow a business to fall outside the scope of this law altogether.
In addition to mitigating the risks once personal information has been compromised, pseudonymization also makes your data a less attractive target for a cyberattack. Pseudonymous data is of little use to an attacker who does not have access to the additional information required to identify individuals.
Outline of Your Anonymization and Pseudonymization Policy
Your reasons for using anonymization and pseudonymization will be specific to your organization. Therefore, every Anonymization and Pseudonymization Policy will be unique. However, here are some of the sections that are common to most Anonymization and Pseudonymization Policies, together with some examples from real organizations.
Scope of the Policy
The "Scope" section of your Anonymization and Pseudonymization Policy explains whom the policy applies to and what activities it covers.
For example, your policy may apply to all members of staff and contractors, and it may apply to particular types of personal information or all personal information.
Here's an example of how to craft a "Scope" section from the Dundalk Institute of Technology:
Note that the policy applies whenever employees and third parties engage in the anonymization of personal information. When doing so, these parties must abide by the process and principles set out in the policy.
Purpose
The "Purpose" section of your Anonymization and Pseudonymization Policy sets out the reasons for which the policy exists.
Broadly speaking, this is to protect personal information. But your policy may serve a more specific purpose within the context of your organization.
Here's an example of such a clause:
In addition to setting out the purpose of your policy, you may also wish to describe the purposes of anonymization and pseudonymization.
Here's an example that explains the benefits of anonymization to the reader:
Explaining the benefits of anonymization and pseudonymization will help ensure your employees adhere to your policy.
Definitions
Your Anonymization and Pseudonymization Policy should provide definitions of anonymization and pseudonymization that are valid under the laws relevant to your organization. It should also define other terms commonly used throughout the policy.
Here's an example from Leicester City Council:
Roles and Responsibilities
Your Anonymization and Pseudonymization Policy should explain who within your organization has responsibility for enforcing the policy and who is responsible for carrying out the processes of anonymization and pseudonymization.
Here's an example of how you can clearly set out different roles and responsibilities:
Your Process
Your Anonymization and Pseudonymization Policy should set out a standardized process by which anonymization and pseudonymization must take place.
This is important to help ensure that your employees anonymize and pseudonymize personal information in a legally compliant way.
The process will be largely specific to the context in which your organization operates, but there are certain standards that must nonetheless be met.
Process of Anonymization
An effective anonymization process will make the re-identification of an individual from the anonymized personal information very unlikely.
The Article 29 Working Party provides a three-part test for assessing the effectiveness of an anonymization method.
Once the anonymization process has been carried out, the following three conditions should apply:
- It is no longer possible to single out an individual
- It is no longer possible to link records relating to an individual
- No information can be inferred concerning an individual
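As a rough illustration of the first condition only (singling out), the following Python sketch counts how many records share each combination of indirectly identifying attributes and returns those that are unique. The function name, field names, and sample rows are hypothetical, and the check is an assumption about how you might screen a data set, not a method taken from the Article 29 Working Party. It says nothing about linkability or inference, which need their own assessment.

```python
from collections import Counter


def singled_out(records, quasi_identifiers):
    """Return the records whose quasi-identifier combination is unique."""
    def key(record):
        return tuple(record[field] for field in quasi_identifiers)

    counts = Counter(key(record) for record in records)
    return [record for record in records if counts[key(record)] == 1]


# Hypothetical sample data
rows = [
    {"age_band": "25-35", "postcode_area": "NE1", "diagnosis": "asthma"},
    {"age_band": "25-35", "postcode_area": "NE1", "diagnosis": "diabetes"},
    {"age_band": "36-45", "postcode_area": "NE2", "diagnosis": "asthma"},
]
unique_rows = singled_out(rows, ["age_band", "postcode_area"])
```

Here the third row is the only one with its age band and postcode area, so it could still be singled out even though no name appears in the data. A data set that returns any rows from a check like this is unlikely to meet the first condition.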
Here's an example of how to carry out effective anonymization, from Northumberland County Council:
The policy provides several methods of anonymization (referred to as "de-identification"), including using age ranges instead of exact ages (e.g. 25-35 instead of 30).
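Replacing an exact age with a range is a form of generalization. A minimal Python sketch might look like the following. The function name `age_band` and its `width` and `start` parameters are hypothetical and are not taken from the Northumberland County Council policy.

```python
def age_band(age, width=10, start=0):
    """Map an exact age to an inclusive range such as '30-39'."""
    if age < start:
        raise ValueError("age is below the first band")
    lower = start + ((age - start) // width) * width
    return f"{lower}-{lower + width - 1}"


default_band = age_band(30)                     # bands of 10 starting at 0
custom_band = age_band(30, width=11, start=25)  # bands of 11 starting at 25
```

With the default settings, an age of 30 becomes "30-39". With a band width of 11 starting at 25, it becomes "25-35", as in the example above. Wider bands reveal less about each individual but also reduce the precision of any later analysis, so the width is a judgment call for your organization.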
Process of Pseudonymization
The process of pseudonymization must ensure that individuals can be re-identified, but only where necessary.
Some methods of pseudonymization include encryption, hashing, and tokenization. However, these measures must meet the conditions described above before they can be considered pseudonymization methods.
For example, encrypted data is only pseudonymized if the encryption key is kept separately, securely, and with limitations on access.
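The same reasoning applies to hashing. A plain hash of a predictable value, such as an email address, can often be reversed by hashing likely guesses and comparing the results. One common mitigation is a keyed hash, where the secret key acts as the additional information. The Python sketch below uses HMAC-SHA256 from the standard library. The function name `keyed_pseudonym` and the sample value are hypothetical, and generating the key inline is for illustration only: the sketch assumes that in practice the key is held in a separate, access-controlled store.

```python
import hashlib
import hmac
import secrets


def keyed_pseudonym(value, key):
    """Return the hex HMAC-SHA256 of value under the secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumption: a real key would be loaded from a separate secrets store,
# not generated next to the data it protects.
key = secrets.token_bytes(32)
other_key = secrets.token_bytes(32)

same_key_first = keyed_pseudonym("ana@example.com", key)
same_key_second = keyed_pseudonym("ana@example.com", key)
different_key = keyed_pseudonym("ana@example.com", other_key)
```

The same value and key always produce the same pseudonym, so records can still be linked for analysis. Without the key, an attacker cannot test guesses against the pseudonyms, which is why the key must be kept separately, securely, and with limitations on access.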
Here's how Northumberland County Council describes some of the standards that must be met when pseudonymizing personal information:
Applicable Laws and Regulations
Your Anonymization and Pseudonymization Policy should list the laws and regulations that apply when anonymizing and pseudonymizing personal information.
Depending on where your customers are based, such laws may include:
- Data Protection Act 2018
- Canadian Personal Information Protection and Electronic Documents Act (PIPEDA)
- Health Insurance Portability and Accountability Act (HIPAA)
- ISO/IEC 27000:2018 IT security standards
Here's an example from Cambridgeshire County Council:
The GDPR and Advantages of an Anonymization and Pseudonymization Policy
The GDPR makes numerous references to data masking techniques such as anonymization and pseudonymization.
Here are some examples:
Article 5 - Data Processing
In Article 5, the GDPR states that personal data should be kept in an identifiable form only for as long as is necessary for the purposes for which it is processed. After that, it may be retained if the data no longer permits the identification of individuals:
Article 25 - Data Protection by Design
In Article 25, the GDPR describes the requirement of businesses to take all reasonable measures to protect consumer data, by default and by design. It specifically mentions pseudonymization as a way to accomplish this:
GDPR Recital 26
In Recital 26, the GDPR specifies that certain data protection measures will not apply to anonymous information that can no longer identify a natural person:
Article 32 - Security of Processing
Security is a key point of the GDPR. Article 32 specifically mentions pseudonymization as an appropriate measure of security to protect the privacy of consumers:
Article 34 - Informing Data Subjects of a Data Breach
According to Article 34, a company must inform users of a high-risk data breach that affects them unless organizational protection measures have rendered the information unintelligible or unidentifiable - such as through pseudonymization or anonymization:
Data masking is not absolutely required by the GDPR. However, it is highly recommended. In fact, the regulation offers incentives for implementing data masking techniques.
- Meeting security requirements: Techniques like pseudonymization and anonymization help you comply with the requirement that businesses implement appropriate technical and organizational measures to protect consumer data. If your company is ever investigated, data masking will provide an additional safeguard if your data security protocols come into question.
- Communicating data breaches: If your data has been anonymized, you will not be required to inform (anonymized) users of a data breach that affects their information. This is because there won't be any way the breached data can be linked back to the individual.
- Complying with user rights: Once data has been anonymized permanently, you will no longer be expected to comply with user rights requests relating to it, such as the right to erasure or the right to request a full copy of user data.
- Moving personal information across international borders: Although it is unclear what level of pseudonymization is required to transfer data over international borders without following a mountain of policies and red tape, data masking could reduce the number of hoops one must jump through to transfer data to another country.
Here's a graph from a Privacy Analytics white paper about the topic:
Which Data Masking Technique is Better?
When it comes to the question of pseudonymization versus anonymization, your business must consider how it uses personal data.
Here are some situations in which you may want to use anonymization instead of pseudonymization:
- If you no longer need to communicate or work with a consumer, but wish to archive their activity, order history, or any other details that could not be used to identify them.
- To perform data analyses that are unrelated to the services you provide the consumer.
- If you need to make data available to a group of people outside those that are designated to fulfill your services, such as a wide group of employees or consultants.
On the other hand, data pseudonymization can be used when you will need to re-identify users in the future:
- To keep data secure during the fulfillment of services, by masking identifying details to employees or other data handlers that do not need those details.
- To maintain data protection within your database or records, order histories, and inactive customers that you remain in contact with.
- To transfer data over international borders.
- To maintain data protection and Privacy by Design principles laid out by the GDPR.
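The first use case in the list above, hiding identifying details from handlers who do not need them, can be sketched in Python as a simple field filter. The field names, the `customer_ref` pseudonym, and the function `view_for_handler` are all hypothetical. The sketch assumes the record already carries a pseudonymous reference whose lookup is stored elsewhere.

```python
# Hypothetical field names; adjust to your own schema.
IDENTIFYING_FIELDS = {"name", "email", "phone"}


def view_for_handler(record, needs_identity):
    """Return a copy of record suited to the handler's need to know."""
    if needs_identity:
        return dict(record)
    # Drop identifying fields but keep the pseudonymous reference.
    return {
        field: value
        for field, value in record.items()
        if field not in IDENTIFYING_FIELDS
    }


# Hypothetical sample record
order = {
    "customer_ref": "c-7f3a9e",  # pseudonym; the lookup is held elsewhere
    "name": "Ana Silva",
    "email": "ana@example.com",
    "order_total": 42.50,
}
warehouse_view = view_for_handler(order, needs_identity=False)
support_view = view_for_handler(order, needs_identity=True)
```

The warehouse view keeps only `customer_ref` and `order_total`, so the order can still be fulfilled and later linked back to the customer by someone authorized to use the lookup. Who counts as needing identity is a policy decision that belongs in the "Roles and Responsibilities" section of your policy.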
Overall, both of these methods present advantages under the GDPR, but may not be feasible for certain data sets or applications. Do your research about all of the implications before performing any data masking measures.
Summary
An Anonymization and Pseudonymization Policy can help ensure your employees are properly applying these important information security techniques to protect personal information within your organization.
Some key parts of your policy should include:
- Scope: Describes which people and activities are covered by the policy
- Purpose: Describes the reasons for implementing the policy
- Definitions: Defines key terms used throughout the policy
- Roles and Responsibilities: Explains who is responsible for overseeing and carrying out the policy
- Process: Provides a legally compliant process for anonymizing and pseudonymizing personal information
- Applicable laws and regulations: Sets out the legal context in which your organization operates