By Brian Hengesbaugh, CIPP/US, and Amy de La Lama
The term "Big Data" is relatively new but has already been credited with many accomplishments. A search engine can identify the hotspots for a flu outbreak faster than the Centers for Disease Control by isolating 40-plus key search terms that correlate to the sickness. A major retailer can perform detailed analysis of its sales history to identify the consumer products that it needs to bolster in its inventory three weeks before a hurricane hits. A credit analytics firm can predict which patients may need reminders to take their medicine based on seemingly irrelevant factors such as how long they have lived at the same address, whether they are married, whether they own a car and how long they have been in the same job. An employer can develop sophisticated login activity profiles for each of its employees in order to detect potential data security breaches through aberrant location, time of day, frequency or other login specifics. These are just a few of the examples of Big Data uses.
How Do Privacy Regulations Impact Big Data?
Data privacy laws specifically regulate the collection, use and disclosure of data about an individual and therefore tend to restrict information flows such as those associated with Big Data. Several key aspects of data privacy regulations may restrict Big Data analysis, including the following:
- Data Subject Notice and Choice
Virtually all privacy laws establish requirements for the owner or controller of the information (data controller) to provide a privacy notice to the individual data subject concerned (data subject) and, in some cases, to provide the data subject with a privacy choice. Notice is generally required at the time of collection, or sooner, and in a context that the data subject can understand. For example, the Federal Trade Commission (FTC) has issued a report, "Protecting Consumer Privacy in an Era of Rapid Change," which articulates a comprehensive privacy framework for consumer personal data that includes three topline requirements: Privacy by Design, simplified choice and greater transparency. Transparency and simplified choice must be offered at the time of data collection, except that choice is not required where the purposes of use and disclosure qualify as "commonly accepted" practices, such as product fulfillment, legal compliance and first-party marketing.
In many instances, notice and consent may create significant obstacles for Big Data. Big Data focuses on taking data captured in one context for one purpose and reusing it in another context for a secondary purpose. Often, the secondary purpose may not be clearly understood by the business—or at least the privacy compliance team—at the time of collection. Use or disclosure of the data for the secondary purpose may therefore be inconsistent with applicable privacy rules, unless the business undertakes an effort to re-notify and/or re-consent the data subjects. Depending on the context, such attempts at privacy compliance activities may not be feasible after initial data collection or, at a minimum, may generate a "dropout" rate where a portion of data subjects refuse to provide consent or otherwise object. Given that Big Data typically leverages modern analytics and computational power to analyze complete data sets rather than sampling a smaller set of data, privacy requirements of notice and choice may directly impact the effectiveness of such analysis.
- Data Subject Rights: Access, Correction, Matching Procedures and Automated Decisions
Privacy laws confer various rights on data subjects to obtain access to personal data that a business maintains about them and to require correction or other actions regarding such data. For example, the Fair Credit Reporting Act (FCRA) imposes various obligations on "consumer reporting agencies" to provide individuals with access to their own consumer reports, an accounting of disclosures and opportunities to correct or otherwise challenge the accuracy of that information. Businesses that furnish information to consumer reporting agencies, as well as users of consumer reports, are also subject to various obligations to validate or correct information when inaccurate, to provide adverse action notices that inform data subjects about their rights and other obligations. The FCRA's broad definition of "consumer reporting agency" has long been viewed as a trap for the unwary. Generally stated, a "consumer reporting agency" is any business or person that, for fees or other compensation, regularly engages in the practice of assembling or evaluating "non-experience" information about consumers for the purpose of disseminating such information to third parties for use in connection with the evaluation of the consumer for credit, employment, insurance or other "permissible purposes." Data brokers, financial institutions and other data controllers often must carefully structure their operations to avoid becoming a consumer reporting agency.
The EU Data Protection Directive also establishes rights for individuals to not be subject to legally significant decisions based solely on automated processing intended to evaluate performance at work, creditworthiness, reliability or other personal aspects. The Data Protection Directive also establishes, as defined in greater detail in national laws, rights for data subjects to obtain access to personal data, including in some instances an accounting of disclosures and uses, as well as an opportunity to correct and object to the processing of personal data.
Data subject rights such as those described above raise hurdles for Big Data analysis. Among other concerns, Big Data participants may not realize that their entire business model, which is predicated on pulling data from multiple sources to produce valuable reports for their customers, may indeed be squarely and strictly regulated by the FCRA, EU Data Protection Directive provisions on automated decisions or other privacy statutes.
- Disclosures to Third Parties
Privacy laws impose various obligations on data controllers before they disclose personal data to sourcing providers, business associates and other third parties that access or process the information on behalf of the data controllers (data processors) and/or third parties that receive and use the data for their own purposes (third-party data controllers). Generally speaking, data controllers are obligated to impose requirements of nondisclosure and non-reuse on their data processors, as well as obligations to provide appropriate data security and other controls. Data controllers are generally obligated to meet certain thresholds before disclosing personal data to third-party data controllers including, in some cases, obtaining data subject consents.
In a Big Data world, service providers can quickly become inadvertent third-party data controllers. For example, a medical device provider may consider itself a data processor acting on behalf of its customer, a healthcare provider, with regard to the collection and handling of personal data about the patient's use of the device (the primary purpose). However, the medical device provider may become an inadvertent third-party data controller if it utilizes the personal data internally for its own Big Data analysis or provides personal data about usage of its device to other third parties for their use in Big Data analysis, i.e., use or disclosure for secondary purposes. The privacy challenges can be significant if a service provider inadvertently becomes a third-party data controller. Among other concerns, the service provider itself becomes directly subject to substantially more privacy requirements than it likely could have originally anticipated, and the service provider's customer is unlikely to have properly addressed privacy requirements with respect to disclosures to the service provider as a data controller.
- Data De-identification
By definition, privacy laws apply only to information that falls within the scope of personal data. As such, a key issue is whether any personal data at issue can be suitably de-identified such that it falls outside the scope of applicable privacy laws. For example, HIPAA establishes a standard that regulated personal data is de-identified when “there is no reasonable basis to believe that the information can be used to identify an individual.” There are two methods to comply with this standard. The first requires a formal determination by a qualified statistician that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information (HIPAA Statistical Method). The second method involves the removal of 18 specified patient identifiers, including but not limited to patient name, location, e-mail address, telephone number, Social Security number and the like (HIPAA Safe Harbor Method).
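As a rough illustration of the identifier-removal approach, the following minimal Python sketch drops a set of direct identifier fields from a record. The field names are hypothetical, and the set covers only the identifier types named above, not the full list of 18, so it should be read as an illustration of the mechanics rather than as a compliant implementation.

```python
# Hypothetical field names. This set covers only the identifier types
# named in the text above; it is NOT the full list of 18 identifiers.
DIRECT_IDENTIFIER_FIELDS = {"name", "location", "email", "phone", "ssn"}


def strip_identifiers(record, identifier_fields=DIRECT_IDENTIFIER_FIELDS):
    """Return a copy of the record without the listed identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in identifier_fields
    }


# Illustrative record with made-up values.
patient = {
    "name": "Jane Example",
    "location": "Chicago",
    "email": "jane@example.com",
    "phone": "555-0100",
    "ssn": "000-00-0000",
    "device_hours": 12.5,
}

deidentified = strip_identifiers(patient)
print(deidentified)
```

Removing fields in this way addresses only the listed identifiers; whether the remaining data could still identify someone is a separate question.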
Outside the U.S., privacy laws can also establish strict standards for anonymization and de-identification. For example, the EU Article 29 Working Party has articulated a standard under which determining whether data has been anonymized requires an analysis that takes into account "all the means likely reasonably to be used by the controller or any other person" to link the data to an individual. In a situation where the data controller deletes names and other identifiers but appends the data with key coding so that data profiles can be compared for Big Data analytics purposes, there still would be an open question of whether a recipient third-party data controller would hold regulated personal data. The analysis would depend on various factors, including risks of an external hack of the original data controller, the likelihood that someone within the data controller’s organization would provide the key and the feasibility of re-identification through indirect means.
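The key-coding scenario can be sketched in Python as follows. The article does not specify how a key code is produced; this sketch assumes a keyed hash (HMAC-SHA256) over a subject identifier, and the function names, field names and key value are all hypothetical.

```python
import hashlib
import hmac


def key_code(identifier: str, secret_key: bytes) -> str:
    """Derive a stable code from an identifier using a keyed hash."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize(record, id_field, secret_key, drop_fields):
    """Drop direct identifiers and append a key code in their place."""
    out = {
        field: value
        for field, value in record.items()
        if field != id_field and field not in drop_fields
    }
    out["subject_code"] = key_code(str(record[id_field]), secret_key)
    return out


# Assumption: the key is held only by the original data controller.
SECRET_KEY = b"held-only-by-the-original-controller"

visit_1 = {"patient_id": "P-1001", "name": "Jane Example", "reading": 7.2}
visit_2 = {"patient_id": "P-1001", "name": "Jane Example", "reading": 6.9}

coded_1 = pseudonymize(visit_1, "patient_id", SECRET_KEY, {"name"})
coded_2 = pseudonymize(visit_2, "patient_id", SECRET_KEY, {"name"})

# The same subject receives the same code, so profiles remain comparable.
print(coded_1["subject_code"] == coded_2["subject_code"])
```

Because the code is stable, anyone who obtains the key can recompute it from a known identifier and link the records back to a person. That is one reason the factors listed above, such as an external hack or an insider supplying the key, bear on whether the recipient holds personal data.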
In a Big Data world, given the commingling and consolidation of different data sources, the potential for re-identification by third parties is significant. Such re-identification could have cascading consequences regarding privacy compliance across all of the issues cited in this discussion, including notice/choice, data subject rights, data security and breach notification and cross-border data transfers.
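To make the commingling risk concrete, the sketch below links a data set with names removed to a second, identified data set using attributes the two share. The attribute names, the choice of linking attributes and all sample values are hypothetical; the point is only that a unique match on shared attributes can restore a name.

```python
from collections import defaultdict

# Hypothetical attributes that appear in both data sets.
QUASI_IDENTIFIERS = ("zip3", "birth_year", "sex")


def link_records(deidentified, identified, keys=QUASI_IDENTIFIERS):
    """Pair each de-identified row with a name when exactly one person matches."""
    index = defaultdict(list)
    for person in identified:
        index[tuple(person[k] for k in keys)].append(person)

    matches = []
    for row in deidentified:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches.append((row, candidates[0]["name"]))
    return matches


# Made-up sample data.
deidentified = [
    {"zip3": "606", "birth_year": 1970, "sex": "F", "refill_gap_days": 12},
    {"zip3": "606", "birth_year": 1985, "sex": "M", "refill_gap_days": 3},
]
identified = [
    {"name": "A. Example", "zip3": "606", "birth_year": 1970, "sex": "F"},
    {"name": "B. Example", "zip3": "606", "birth_year": 1985, "sex": "M"},
    {"name": "C. Example", "zip3": "606", "birth_year": 1985, "sex": "M"},
]

matches = link_records(deidentified, identified)
print(matches)
```

In this sample only the first row is linked to a name, because two people share the attributes of the second row. Adding further data sources tends to make more combinations unique, which is the concern raised above.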
- Cross-Border Data Transfer Restrictions
Certain privacy laws include specific restrictions on the cross-border transfer of personal data outside the local jurisdiction to countries that do not maintain adequate or equivalent data protection laws. For example, the EU Data Protection Directive provides that personal data may not be transferred to third countries unless the recipients are located in a jurisdiction that provides adequate protection or certain exceptions are satisfied. In a Big Data world, a data controller may share information with data processors and/or third-party data controllers in non-local jurisdictions, which may in turn flow data downstream to other recipients with whom the data controller has no preexisting privity of contract or other controls—or even knowledge of such disclosures. Such scenarios pose significant challenges for a data controller seeking to address cross-border data transfer restrictions.
What Are the Privacy Solutions for Big Data Initiatives?
Privacy professionals should consider the following for Big Data privacy compliance:
- Stay close to the business teams. Early notification of Big Data plans can help reduce the chances of direct conflicts between "Big Data" and "Big Data Privacy" by giving privacy professionals an earlier opportunity to address applicable privacy laws.
- Enhance privacy notices/consents. Within the constraints of applicable privacy laws, consider enhancing or expanding the language in existing privacy notices and privacy choices to reflect secondary purposes for analytics and related Big Data activities.
- Improve controls on data usage and information management. Privacy compliance for Big Data, including proper actions to address data subject access, accounting of disclosures, correction, erasure and other rights, requires data controllers to maintain rigorous controls over data usage and information management across the full lifecycle of information.
- Enhance policies and procedures on third-party disclosures. Consider adopting or enhancing internal policies to require advice from privacy compliance before disclosure of personal data to data processors as well as third-party controllers.
- Consider de-identification as a strategy. Strategies to manage privacy risks can include de-identification techniques to the extent possible.
- Address cross-border data transfer restrictions. Special care should be taken to address data movement within the group of companies and external disclosures that involve cross-border data transfers.
Brian Hengesbaugh, CIPP, is a principal with Baker & McKenzie in Chicago and a member of the firm's Global Privacy Steering Committee. He focuses on domestic and global data protection and privacy, data security, online, mobile, social media, and e-commerce issues.
Amy de La Lama is Of Counsel in the Chicago office of Baker & McKenzie. She focuses on global and domestic data protection and privacy, including on cross-border, mobile and health privacy issues.