Last updated on 03 October 2022 by Robert Bateman (Privacy and Data Protection Research Writer at TermsFeed)
The EU's General Data Protection Regulation (GDPR) sets strict rules about what you can and can't do with the personal data you collect.
This can include something as common and basic as using a web server to collect log data.
Let's take a look at what the GDPR says when it comes to how you must handle log data in a compliant manner.
Personal data, according to Article 4 (1), means any information relating to an identified or identifiable person.
There are countless examples, such as:
According to Article 3 (2), the GDPR applies to anyone - based anywhere in the world - who:
Also according to the GDPR, data protection is a basic human right. Recital 1 of the GDPR states that "everyone has the right to the protection of [their] personal data."
Log data is collected by applications, websites and instant messaging platforms to record the interactions between a user and a system. Log files hold a record of activity on a web server, and can be used to identify things such as:
Log files can contain any information that a website admin wants. At the very least, they will usually contain an IP address.
There had been some question in the past as to whether IP addresses - particularly dynamic IP addresses, which change each time a user connects to the internet - are considered to be personal data. This question was largely laid to rest in 2016 when the Court of Justice of the European Union ruled on the case of Breyer v Germany:
Because they can be used to identify a person, dynamic IP addresses should be treated as personal data. This is confirmed in the GDPR, for example at Recital 30:
"persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses"
To comply with the GDPR, you need to think carefully about:
You must make sure that everything you do in relation to log data complies with the six principles of data processing set out in Article 5 of the GDPR. These principles provide the foundation for all data processing activity.
We'll look at how you can apply five of these six principles to your processing of log data. The remaining principle is accuracy. Accuracy is not very relevant to this type of processing. It will suffice to say that you must always try to make sure that any personal data in your possession is accurate.
Compliance with this principle requires that you:
According to Recital 58 of the GDPR:
"The principle of transparency requires that any information addressed to the public or to the data subject be concise, easily accessible and easy to understand, and that clear and plain language [...] be used."
This means that although the legal and technical details might be complicated, you must find a way to explain them to your users in terms that they'll understand.
Processing personal data can only take place on one of six lawful bases set out at Article 6 of the GDPR: consent, contract, legal obligation, vital interests, public task, and legitimate interests.
You need to determine your legal basis for any use of log data containing personal data.
Legitimate interests is the most flexible of the legal bases, but it is not necessarily the most appropriate one for every situation. You can only process personal data on the basis of your legitimate interests if you've considered how you might impact upon your users' privacy rights, and if you're treating their data in a way that they would reasonably expect.
You can rely on your legitimate interests in processing personal data if you undertake a Legitimate Interests Assessment and satisfy the three-part test:
Purpose - are you pursuing a legitimate interest?
For example, you might be logging IP addresses of visitors to your website in order to maintain security.
Recital 49 of the GDPR specifically says that the "processing of personal data to the extent strictly necessary and proportionate for the purposes of ensuring network and information security" is a legitimate interest.
Necessity - is the processing necessary to fulfill your legitimate interests?
Is there another way that you might fulfill this purpose, that has a smaller impact on your users' privacy? In the case of logging IP addresses to maintain your website's security, probably not. If your website is being repeatedly targeted by spammers, for example, you need to know something about who's doing it.
Balancing - do your interests override your users' rights to privacy in this case?
You need to balance your interests against your users' rights to data protection. If you take appropriate care to process their IP addresses in a way that exposes them to minimal risk, for example by storing this data securely and keeping it only for as long as is necessary to achieve your aims, then this should represent an overriding legitimate interest.
Under the GDPR, consent must be freely given, specific, informed, and unambiguous.
According to Recital 42 of the GDPR:
"Consent should not be regarded as freely given if the data subject has no genuine or free choice or is unable to refuse or withdraw consent without detriment."
The implication of the rules around consent is that you shouldn't be relying on consent to collect data that you need to keep your website or app running securely. But you should usually ask for consent to process data for other purposes.
The only personal data that most log files contain is an IP address. But your log files could contain all sorts of personal data.
Some log data might be essential for your purpose of maintaining your website. Other information might just be helpful in allowing you to improve your website by tracking down bugs.
These are two different data processing purposes. You may need to rely on a separate legal basis for each.
A delivery form on your website asks your users for their mailing address. The form crashes your website roughly 10 percent of the time. To find out why, you log the mailing address data of everyone who uses the form. You discover that the form crashes when addresses are entered in a particular format.
You now have mailing address data in your log files. You're processing people's data in a way that they might not reasonably expect. If you have to carry out this type of data processing, you should consider seeking your users' consent.
Google Chrome asks users for consent to send crash reports when they first download its web browser app:
Clicking on "Learn more" takes you to this informative page that helps users make a decision about how their personal information is used by Google:
Note that Chrome includes cookies in its log data collection. Certain cookies also count as personal data and require consent before you use them.
The European Data Protection Supervisor states that you must "limit processing of data through an IT system to its primarily specified purpose."
Recital 50 of the GDPR states that you can only process data for the purposes for which you collected it. If your server logs are designed to help you maintain the security and functionality of your site, you can't use the personal data they contain for any other purpose.
You tell your users that you log referral data - information about how they got to your website and which website they'd been on previously - for security purposes. This can be a legitimate way to use this type of data, as you might notice that a large number of users are being referred to your site from a particular phishing site.
Website admins can also use referral data in their logs to analyze the effectiveness of marketing campaigns. This can also be perfectly legitimate. But it's an example of "further processing" which might not be compatible with the purposes for which you originally collected this data.
Remember that not all of your log data is personal data - but some of it is. You must make sure that your users know about the purposes for which you use the personal data in your log files.
Data collection in server logs should be turned "off" by default. If you must collect personal data in your log files, Article 5 (1)(c) states that it must be "adequate, relevant and limited to what is necessary."
There are obvious ways to comply with this principle, including not logging any personal data you don't need. If you're logging usernames, payment card details, locations, etc. unnecessarily, it should be easy enough to stop doing this by editing your data collection forms and methods.
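One way to picture this is an allowlist applied before anything is written to the log. The following is a minimal Python sketch, not a definitive implementation: the field names and the `minimize_entry` helper are hypothetical, and which fields you actually need depends on your own purposes.

```python
# Hypothetical set of fields judged necessary for security and debugging.
# Anything not listed here (usernames, card details, locations) is dropped.
ALLOWED_FIELDS = frozenset({"timestamp", "method", "path", "status"})


def minimize_entry(entry: dict, allowed: frozenset = ALLOWED_FIELDS) -> dict:
    """Return a copy of the log entry containing only allowlisted fields."""
    return {key: value for key, value in entry.items() if key in allowed}


# Example entry with made-up values; only the allowlisted keys survive.
raw_entry = {
    "timestamp": "2022-10-03T12:00:00Z",
    "method": "POST",
    "path": "/checkout",
    "status": 500,
    "username": "example-user",
    "card_number": "0000 0000 0000 0000",
}
print(minimize_entry(raw_entry))
```

An allowlist fails safe: a new field added to your application stays out of the logs until someone decides it is needed.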
What's less obvious is how to minimize the collection of more typical types of log data such as IP addresses and other technical information.
The Internet Engineering Task Force's Internet Area Working Group (IntArea) has produced some draft guidance which makes a suggestion about how to minimize the data collected for your log files:
"keep only the first two octets (of an IPv4 address) or the first three octets (of an IPv6 address) with remaining octets set to zero, when logging"
For example, here are the first two octets of an IPv4 address:
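As a rough illustration, the truncation described in the draft could be sketched in Python with the standard `ipaddress` module. The `truncate_ip` function name is an assumption made for this example, and the sample addresses come from ranges reserved for documentation.

```python
import ipaddress


def truncate_ip(address: str) -> str:
    """Zero everything after the first 2 octets (IPv4) or 3 octets (IPv6)."""
    ip = ipaddress.ip_address(address)
    keep_bits = (2 if ip.version == 4 else 3) * 8
    # Build a mask with the leading keep_bits set and the rest cleared.
    mask = ((1 << keep_bits) - 1) << (ip.max_prefixlen - keep_bits)
    # Rebuild with the same address class so IPv6 stays IPv6.
    return str(type(ip)(int(ip) & mask))


print(truncate_ip("203.0.113.77"))                  # 203.0.0.0
print(truncate_ip("2001:db8:85a3::8a2e:370:7334"))  # 2001:d00::
```

You would call a function like this on the client address before the log line is written, so the full address is never stored.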
IntArea also suggests that you don't:
"log unnecessary identifiers, such as source port number, time stamps, transport protocol numbers or destination port numbers."
There was a time when everyone involved in running a website or an online business was clamoring to obtain as much personal data about website visitors as possible. Those days are over.
Article 5 (1)(e) of the GDPR states that personal data must be stored "for no longer than is necessary for the purposes for which the personal data are processed."
This period will vary depending on the reason you collect a particular type of log data. IntArea suggests that you shouldn't:
"store logs of incoming IP addresses from inbound traffic for longer than three days [...] a three-day logging period covers a week-end, which is convenient for professional server providers."
Many companies choose to delete log files after 7 days. You can automatically delete old log files with a tool like logrotate.
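If you'd rather see the idea in code, here is a minimal Python sketch of age-based deletion. It assumes your logs sit in a single directory and match `*.log*`. The `purge_old_logs` name, the file pattern, and the 7-day default are illustrative choices, not requirements taken from the GDPR.

```python
import time
from pathlib import Path


def purge_old_logs(log_dir, max_age_days=7, now=None):
    """Delete log files last modified more than max_age_days ago.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    for path in Path(log_dir).glob("*.log*"):  # e.g. access.log, access.log.1
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Usage (hypothetical path), typically run once a day from a scheduler:
# purge_old_logs("/var/log/myapp", max_age_days=7)
```

Whatever retention period you choose, running the deletion automatically means it doesn't depend on someone remembering to do it.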
Here's how AdBlock does this:
It's very important that you keep your users' personal data safe. This applies as much to log files as it does other kinds of personal data. You must have systems in place for preventing a data breach and reporting any data breaches that do occur.
In Article 25, the GDPR requires "data protection by design and by default." This means that you should develop your website or app in such a way that builds secure data processing into its core functioning.
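If you become aware of a personal data breach, Article 33 requires you to report it to your supervisory authority (data protection authority) without undue delay and, where feasible, within 72 hours, unless the breach is unlikely to result in a risk to people's rights and freedoms. Where the breach is likely to result in a high risk to your users, Article 34 requires you to alert them directly as well.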
Article 35 of the GDPR also requires you to carry out a data protection impact assessment if you're carrying out risky data processing. You might need to carry out a data protection impact assessment if, for example, you're using new technology that might make your log data vulnerable until bugs are worked out.
It might not be immediately apparent that log data is subject to the GDPR. However, because log files can contain personal data, the principles of data processing apply here just as they do to the processing of any other personal data.
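Make sure that your processing of log data complies with the principles of lawfulness, fairness and transparency; purpose limitation; data minimization; storage limitation; and integrity and confidentiality.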
This article is not a substitute for professional legal advice. This article does not create an attorney-client relationship, nor is it a solicitation to offer legal advice.