The Working Party’s Views on Purpose Limitation and Big Data
By Stefano Tagliabue, CIPP/E
The concept of purpose limitation is a cornerstone of the protection of personal data. It is an essential first step in applying data protection laws since it constitutes a prerequisite for other data quality requirements, contributes to transparency and legal certainty and sets limits on how controllers are able to use personal data.
The Article 29 Working Party (WP) analyses this fundamental principle in great detail in its Opinion 3/2013, adopted last April. The opinion also offers interesting perspectives for those of us struggling to combine Big Data analytics with the protection of personal data and conformity with the law.
As the WP points out, under the EU Data Protection Directive the concept of purpose limitation has two main building blocks: The personal data must be collected for “specified, explicit and legitimate” purposes (purpose specification) and not be “further processed in a way incompatible” with those purposes (compatible use).
A compatibility assessment is therefore the decisive test for determining whether any further use of personal data may be considered legitimate. In particular, the following key factors should be taken into account:
- the relationship between the purposes for which the data have been collected and the purposes of further processing,
- the context in which the data have been collected and the reasonable expectations of the data subjects as to their further use,
- the nature of the data and the impact of the further processing on the data subjects, and
- the safeguards applied by the controller to ensure fair processing and to prevent any undue impact on the data subjects.
Failure to comply with the compatibility requirement has serious consequences: The processing of personal data in any way that is incompatible with the purposes specified at collection is unlawful. In this regard, the WP underlines that trying to legalise an otherwise incompatible data processing activity simply by changing the terms of a contract, or by identifying an additional legitimate interest of the controller, would go against the spirit of the purpose limitation principle and remove its substance.
Interestingly, the WP also notes that the directive contains a specific provision that allows further processing of data for historical, statistical and scientific research as long as appropriate safeguards are implemented to ensure that the data will not be used to support measures or decisions regarding any particular individuals. “Statistical purposes,” in particular, may range from analyses of public interests to commercial purposes, e.g., Big Data applications.
Big Data Applications
In the relevant section of the opinion, the WP acknowledges that, despite its potential for innovation, Big Data raises many concerns:
- the sheer scale of data processing, especially when data come from many different sources;
- the security of data, with levels of protection lagging behind the expansion in volume;
- lack of transparency, which may cause individuals to be subject to decisions that they do not understand or control;
- inaccuracy, discrimination and exclusion, e.g., prejudices and social exclusion perpetuated by decisions based on computer algorithms;
- increased economic imbalance between large corporations and consumers; and
- increased possibilities of government surveillance.
So, what safeguards would make the further use of personal data for analytics compatible?
The WP makes a distinction between two different scenarios. In the first one, an organisation specifically wants to analyse or predict the personal preferences, behavior and attitudes of individual customers, in order to inform “measures or decisions” that are taken toward them. In this case, free, specific, informed and unambiguous opt-in consent would almost always be required. For the consent to be informed, organisations should disclose their decisional criteria, because, more often than not, the inferences drawn are more sensitive than the information itself.
The second potential scenario is when the organisation processing the data only wants to detect trends and correlations in the information, without effects on single individuals. In this case, the concept of functional separation is likely to play a key role in deciding whether further use of the data can be considered compatible. “Functional separation” means that data used for statistical or other research purposes should not be available to support measures or decisions that are taken with regard to the individual data subjects. To comply with this requirement, controllers need to take the necessary technical and organisational measures:
- When possible, full anonymisation, including a high level of aggregation, is the most definitive solution.
- Partial anonymisation or partial de-identification may be the appropriate solution in some situations when complete anonymisation is not practically feasible. To this end, various techniques (key-coding, keyed-hashing, replacing unique IDs and others) should be used and combined with other safeguards such as data minimisation.
- Directly identifiable personal data may be processed only if anonymisation or partial anonymisation is not possible without frustrating the purpose of the processing, and provided that appropriate and effective safeguards are in place.
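As an illustration of the keyed-hashing technique mentioned in the list above, here is a minimal Python sketch. It assumes HMAC-SHA256 as the keyed hash and uses hypothetical field names (customer_id, monthly_minutes) and made-up records; the opinion itself does not prescribe any particular algorithm, so this is one possible approach rather than the WP's method.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, key: bytes) -> str:
    """Return a keyed hash (HMAC-SHA256) of a direct identifier as a hex string."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secret key. In a real deployment it would be generated once
# and stored separately from the analytics data set.
key = secrets.token_bytes(32)

# Made-up records with a direct identifier (assumption for illustration only).
records = [
    {"customer_id": "alice@example.com", "plan": "prepaid", "monthly_minutes": 120},
    {"customer_id": "bob@example.com", "plan": "contract", "monthly_minutes": 340},
    {"customer_id": "alice@example.com", "plan": "prepaid", "monthly_minutes": 95},
]

# Data minimisation: keep only the field the analysis needs, and replace
# the direct identifier with its keyed hash.
pseudonymised = [
    {"pid": pseudonymise(r["customer_id"], key), "monthly_minutes": r["monthly_minutes"]}
    for r in records
]
```

Because the same key always yields the same pseudonym, records belonging to one person can still be linked for trend analysis. Anyone holding the key can recompute the mapping, however, so this is pseudonymisation rather than anonymisation, and the key needs to be protected and kept apart from the data.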
Furthermore, the following measures may bring additional protection to the data subjects:
- taking specific additional security measures, such as encryption;
- in case of pseudonymisation, making sure that data enabling the linking of information to a data subject (the keys) are themselves also coded or encrypted and stored separately;
- entering into a trusted third-party arrangement when a number of organisations want to anonymise the personal data they hold for use in a collaborative project;
- restricting access to personal data on a need-to-know basis; and
- permitting further processing of personal data concerning health, children or other highly sensitive information, in principle, only with the consent of the data subject.
The WP also extends its analysis to open data projects, which often involve making entire databases available in a standardised electronic format to any applicant, free of charge and for any commercial or noncommercial purpose under an open licence.
While it is not easy to reconcile the two concerns of unrestricted information reuse and purpose limitation, the WP notes that any information relating to an identified or identifiable natural person, be it publicly available or not, constitutes personal data protected by law. Moreover, once personal data are publicly available for reuse, it will be difficult, if not impossible, to retain any control over their further use.
Once more, complete anonymisation combined with a high level of aggregation of personal data is the most definitive solution. However, re-identification of individuals is an increasingly common and present threat, and there is a significant grey area where it is difficult to assess in advance if re-identification may be possible. For this reason, the WP highlights the importance of an effective privacy impact assessment, not monopolised by any interested parties, in order to decide what data may be made available for reuse and at what level of anonymisation and aggregation.
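To make the idea of a "level of aggregation" concrete, the following Python sketch publishes only group counts and suppresses groups below a minimum size. The threshold of 5, the field name city and the sample records are all assumptions chosen for illustration; the opinion does not set a numeric threshold, and small-group suppression alone does not guarantee that re-identification is impossible.

```python
from collections import Counter

# Hypothetical minimum group size; not a value taken from the opinion.
MIN_GROUP_SIZE = 5


def aggregate_counts(records, field, min_group_size=MIN_GROUP_SIZE):
    """Count records per value of `field`, dropping groups below the threshold."""
    counts = Counter(r[field] for r in records)
    return {value: n for value, n in counts.items() if n >= min_group_size}


# Made-up input: six records for one city, two for another.
records = [{"city": "Milan"}] * 6 + [{"city": "Turin"}] * 2

published = aggregate_counts(records, "city")
```

In this sample, the group with only two records is withheld from the published result, while the larger group is released as a count with no individual-level rows.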
Stefano Tagliabue, CIPP/E, CISSP, CISA, works in Telecom Italia’s Privacy Department and has years of experience in managing privacy and information security issues in the telecommunication industry. Stefano co-chairs the IAPP KnowledgeNet in Milan, Italy.