Open Source Data: Big Data for All
Think about it: Wouldn’t you love to know everything your cellphone knows?
I mean, not just the stuff about the universe—like the distance between Des Moines and Billings or the weather in Ulaanbaatar, Mongolia—but also information about you. Like the time you wake up in the morning, your movements around town, when and where you tend to get stuck in traffic, how much exercise you are getting, what you are eating, your online clickstream, social networking activities, communications, contacts, calendar and more. If only you could tap into this information, analyze it and draw useful conclusions, you could no doubt improve your effectiveness and quality of life.
Over the past decade, the privacy framework has become preoccupied with organizational data management processes grouped under the title “accountability.” While they improve corporate governance and mitigate data security risks (no doubt admirable goals), accountability measures generate little benefit for individuals. Indeed, by treating organizations as trusted stewards of personal information, accountability cuts individuals out of the decision-making process. You want privacy? Walmart or Pfizer will take care of it for you.
In a new article, Big Data for All: Privacy and User Control in the Age of Analytics, which will be published in the Northwestern Journal of Technology and Intellectual Property, Jules Polonetsky, CIPP/US, and I try to refocus the privacy framework on individual empowerment. We argue that going forward, organizations should provide individuals with practical, easy-to-use access to their information, so they can become productive participants in the data economy. In addition, organizations should be transparent about the decisional criteria underlying their data processing activities, allowing individuals to challenge, or at the very least understand, how decisions about them are made.
First, we propose a “sharing the wealth” strategy premised on organizations providing individuals with access to their data in a usable format. This, in turn, will spawn the development of user-side applications that analyze individuals’ data to draw useful conclusions (“take the I-90”; “eat more proteins”; “call Sally”). We call this the “featurization” of Big Data: taming the Big Data “beast,” which is currently used strictly for server-side surveillance, and harnessing it for domestic use. The technological groundwork has already been laid, with mash-ups and real-time APIs making it easier for organizations to combine information from different sources and services into a single user experience. Much like open source software or Creative Commons licenses, free access to personal data is grounded in both efficiency and fairness rationales.
Second, we suggest that organizations should be more forthcoming about the criteria used in their data-driven choices. Data analysis machinery is increasingly used to make choices that affect individuals’ lives. Will you get credit or insurance? Which medical treatments suit you best? Which ad will you be shown? The machine, the thinking goes, is more impartial, objective and efficient than any individual referee.
Yet the machine is also heartless.
Consider Bettina Wulff, the thirty-something-year-old wife of former German President Christian Wulff. When you Google her name, the search engine’s autocomplete function adds terms like “escort” and “prostitute.” Wulff, who has sued Google for defamation, vehemently denies these machine-made charges.
Yet who can argue against the machine? Who do you believe—Ms. Wulff or the Google algorithm?
The machine also lacks moral judgment.
Consider the “pregnancy score” allocated by Target to shoppers in order to preempt competitors with early detection of mothers-to-be. Jules and I argue that as “objective” as the machine may be, it must be tempered by a human conscience. Individuals should not be judged by machines operating based on opaque criteria that are hidden behind veils of trade secrecy. We are entitled to know, challenge and debate the merits of data-driven decisions. And while we recognize the practical difficulties of mandating disclosure without compromising organizations’ “secret sauce,” we trust that a distinction can be drawn between proprietary algorithms, which would remain secret, and decisional criteria, which should be disclosed.
The Big Data economy generates immense benefits for organizations, individuals and society at large. In order to preserve the value of data while protecting privacy rights, individuals should be granted a voice as well as a piece of the action.
About the Author
Omer Tene is Vice President of Research and Education at the IAPP, where he administers the Westin Fellowship program and fosters ties between industry and academia. He is also Vice Dean of the College of Management School of Law, Rishon Le Zion, Israel; an Affiliate Scholar at the Stanford Center for Internet and Society; and a Senior Fellow at the Future of Privacy Forum. He has published extensively in US and European law reviews about big data, online tracking, and international privacy law.