Network of Excellence in Internet Science

Session 1: Privacy

Samir Passi (Royal Netherlands Academy of Arts and Sciences, KNAW) – ‘Slide to Unlock?’: Mobile convergence and collapsing contexts

This presentation will highlight privacy issues raised by increasing access to social networks made possible by various mobile applications. I will focus on the unintended consequences of the ability of third-party apps to interact not only with the online databases and services of social networks but also with a user’s personal data within the mobile device itself. The content of the presentation is based on the review work that I am currently doing for the EINS JRA 5.1.1 deliverable (Analysis of Privacy, Reputation, and Trust in Social Networks) and relates to the disciplines of Science and Technology Studies (STS) and Information and Communication Technology (ICT).

Online social networks – which primarily started out as web services – have now evolved into social platforms that not only serve individual users but also offer developers the means to interact with the platform. Social networks such as Facebook, Foursquare, and Google+ provide programming interfaces that developers can use to build applications that interact with the platform’s data and services. Depending on the nature of the network, these third-party applications can then generate novel means to catalogue, classify, and correlate information pertaining to the entire user base of multiple social platforms. A well-known example is the TweetDeck application, which allows its users to interact with Facebook and Twitter simultaneously.

With the widespread diffusion of smartphones and tablets, this capacity for novel, large-scale convergence of social information has implications for the sociology of user expectations concerning information privacy. Through their mobile variants, these applications can scan a user’s contacts, messages, photos, and location in addition to information from various social platforms, which sometimes gives their ‘use’ of the gathered data unintended consequences. An oft-cited example is ‘Girls around me’. Through this app, a person could search his/her surroundings for nearby girls. The app took public check-in data from Foursquare and coupled it with girls’ public Facebook images to give the user an interactive map with a comprehensive visualization of information about girls near his/her location. Although the app was subsequently taken down, the example clearly shows how third-party social applications can affect societal notions of privacy and trust by enabling the large-scale tagging, identification, and convergence not only of online information but also of the exact locations of mobile users.

An in-depth understanding of public and private contexts, in relation to characteristics particular to the mobile medium, provides a relevant point of entry for examining such privacy and trust issues. Although certain users may have separately made their Facebook photos and Foursquare check-ins public, the combination of the two, coupled with an exact location on a map, is certainly not what these girls explicitly consented to. By identifying and merging scattered bits of information, apps such as ‘Girls around me’ collapse public and private contexts, posing a substantial threat to individuals’ privacy and personal security and stretching the boundaries of socially acceptable forms of data mining.

Moreover, although such apps can be regulated through the standardized app stores run by Google or Apple, the ease of building on social and mobile platforms makes it increasingly difficult to manage and govern the intent of the large number of mobile apps developed each day. Social networks and mobile devices have become ubiquitous tools that individuals use to manage their everyday lives, and mobile app development has become a substantial market in itself. In such a scenario, it is imperative to examine the implications of third-party applications’ ability to facilitate the large-scale convergence of user information in ways that are novel and non-traditional. At a time when ‘privacy as contextual integrity’ and ‘privacy by design’ feature prominently on the societal agenda, this presentation will offer insights into questions such as: what does contextual integrity translate to for the increasingly ubiquitous mobile medium, and what must we know before we start designing privacy into mobile apps and social platforms?

Rayman Preet Singh, S. Keshav, and Tim Brecht (University of Waterloo) – PEDE: A Cloud-Based Personal Execution and Data Hosting Environment

Increasing amounts of data are being generated and collected by, on behalf of, and about individuals.

Some of this data is generated by traditional applications and services such as document processors, e-mail, media-sharing services, web browsers, instant messaging, and social networking services. Other emerging sources include devices that act as sensors, such as smart meters, health-care monitors, smartphone-based sensing, and the monitoring of individuals' banking and shopping activities. Most often, data is collected by service providers who take ownership and full control of it – thereby putting users' privacy at risk – in exchange for free services. There is growing discomfort among consumers about relying on service providers' changing privacy policies, losing data privacy and control, and having to trust these services. This is evident from dissent against leading social networking and media-sharing services, and from cases of serious user resistance to the installation of smart meters that collect energy consumption data. These concerns are not without warrant: recent research has demonstrated that such data can be mined to reveal private information about users. For instance, energy consumption data from smart meters can be used to determine occupancy, appliance use, and even the TV channel being watched! Other forms of user data, such as messages, photos, videos, location, health statistics, and spending activities, are unarguably private in nature, and their collection by service providers poses new threats to user privacy. However, keeping user data completely private makes it impossible to offer data-driven recommendations that could benefit users. Our goal is to build an environment that balances data privacy and data analytics.

We propose a framework in which users place their data at a universally accessible location that they individually own and control. A user's data resides in the cloud within her Personal Execution and Data Environment (PEDE), which provides reliable storage for hosting the data and computation for running applications on it. Using modern clouds to host PEDEs relieves the user of the problems of warehousing the data, keeping it accessible, provisioning computation to process it, and consolidating it from multiple sources, all of which arise when commodity devices are used for the purpose. Users download applications into their PEDEs, where the applications interface with the user's data and any other services the PEDE may offer. As the PEDE's owner, a user can configure applications' access to the data (and services) and can impose her own privacy policies, enabling a privacy-preserving application ecosystem for data that remains under her purview at all times. In this ecosystem, third-party developers build applications that process the data and generate meaningful results for the user, enhancing the data's value while fully respecting users' data ownership, privacy, and control. Much like the app stores that have enriched the mobile user experience, such an ecosystem would enable innovation in data processing, which is currently stifled because user data is locked away with service providers.
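As a rough illustration of this ownership model, the sketch below shows how a PEDE might mediate application access to locally hosted data through owner-defined policies. The class and method names, and the policy model itself, are our own illustrative assumptions rather than the actual PEDE interface.

```python
# Hypothetical sketch of a PEDE mediating application access to user data.
# All names and the policy model are illustrative assumptions, not the real
# PEDE implementation.

class PrivacyPolicy:
    """Owner-defined rules describing which data streams an app may read."""
    def __init__(self, allowed_streams):
        self.allowed_streams = set(allowed_streams)   # e.g. {"energy", "location"}


class PEDE:
    """A user's Personal Execution and Data Environment."""
    def __init__(self):
        self.streams = {}    # stream name -> list of (timestamp, value) samples
        self.policies = {}   # application id -> PrivacyPolicy set by the owner

    def install_app(self, app_id, policy):
        """The owner installs an application and attaches her own policy."""
        self.policies[app_id] = policy

    def read(self, app_id, stream):
        """Applications can read data only through this policy check."""
        policy = self.policies.get(app_id)
        if policy is None or stream not in policy.allowed_streams:
            raise PermissionError(f"{app_id} is not allowed to read '{stream}'")
        return list(self.streams.get(stream, []))


# Example: an energy-audit app may read meter data but not the location stream.
pede = PEDE()
pede.streams["energy"] = [("2012-11-01T00:00", 1.2), ("2012-11-01T01:00", 0.9)]
pede.install_app("energy_audit", PrivacyPolicy(allowed_streams={"energy"}))
readings = pede.read("energy_audit", "energy")        # allowed
# pede.read("energy_audit", "location")               # would raise PermissionError
```

The key design point is that the policy check sits inside the environment the user owns, so an application never obtains a copy of the raw data store itself.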

We are studying a cloud-based architecture that uses PEDEs to give third parties fast, consolidated, universal, and privacy-preserving access to a user's data while the user retains complete ownership and control of that data. We build example applications to demonstrate how appliance vendors, energy auditors, and other third parties can develop consumer applications on our platform while preserving consumer privacy. Possible use cases include applications that perform detailed analysis, tailored to individual users, to quantify the benefits of purchasing energy-efficient appliances and to help users better understand and control their energy consumption.
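A minimal sketch of such an analysis, written as a third-party application that would run inside the user's PEDE, might look as follows; the function name, parameters, and figures are hypothetical and only illustrate the kind of computation performed over locally hosted meter data.

```python
# Illustrative analysis an energy-audit app could run inside a PEDE: estimate
# the payback period of replacing an appliance with a more efficient model.
# All parameters and figures are hypothetical.

def estimate_payback(daily_kwh, appliance_share, efficiency_gain,
                     price_per_kwh, appliance_cost):
    """daily_kwh: daily household consumption readings (kWh) from the PEDE."""
    avg_daily = sum(daily_kwh) / len(daily_kwh)
    daily_savings_kwh = avg_daily * appliance_share * efficiency_gain
    yearly_savings = daily_savings_kwh * 365 * price_per_kwh
    return appliance_cost / yearly_savings             # payback period in years


# Example: a fridge drawing ~15% of the load, replaced by a model using 30% less energy.
years = estimate_payback(daily_kwh=[24.0, 26.5, 23.1, 25.4],
                         appliance_share=0.15, efficiency_gain=0.30,
                         price_per_kwh=0.12, appliance_cost=900.0)
print(f"Estimated payback period: {years:.1f} years")
```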

Prior work has recognized the problem of data privacy and offered theoretical advances such as differential privacy and homomorphic encryption, providing protocols that protect the privacy of the data while still enabling computation on it. Unfortunately, prior work does not describe systems that enable application development and deployment. Our work is unique in that it leverages the rich infrastructure of modern clouds to provide an environment in which these algorithms can be implemented.
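For concreteness, the sketch below shows the Laplace mechanism, one of the basic constructions of differential privacy, as a PEDE could apply it before releasing an aggregate of a user's data. The function names and parameter values are illustrative assumptions, not part of our system.

```python
# Minimal sketch of the Laplace mechanism from differential privacy, applied
# to a single aggregate release. Function names and parameters are illustrative.
import math
import random

def laplace_noise(scale):
    """Sample from a Laplace(0, scale) distribution via inverse transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_sum(values, lower, upper, epsilon):
    """Release the sum of bounded values with epsilon-differential privacy.

    Each value is clipped to [lower, upper], so replacing one record changes
    the sum by at most (upper - lower); Laplace noise with scale
    sensitivity / epsilon then protects this single release.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = upper - lower
    return sum(clipped) + laplace_noise(sensitivity / epsilon)

# e.g. release total hourly energy use with a modest privacy budget
print(private_sum([1.2, 0.8, 2.5, 1.9], lower=0.0, upper=5.0, epsilon=0.5))
```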

Yves-Alexandre de Montjoye and Alex “Sandy” Pentland (MIT Media Lab) – Protecting the Privacy of Personal Data through Change of Ownership

Personal data – digital information about users' locations, web searches, and preferences – is undoubtedly the oil of the new economy. However, the same smart algorithms that conveniently advise you on the next movie you should watch or the restaurant you should eat at can also infer more than you might want them to from seemingly harmless data.

Our contribution is two-fold. First, we argue that as soon as personal data becomes rich enough, it cannot be generically anonymized without severely limiting its uses. Second, we introduce openPDS, a privacy-preserving personal data store. openPDS allows for generic, on-the-fly uses of personal data while protecting user privacy. Such a user-centric model defines a new paradigm for protecting internet privacy.

In this work, we review the existing privacy literature and de-anonymization methods, focusing on high-dimensional data and, more particularly, on location data. We argue that there are no privacy-preserving methods that anonymize the data a priori for a broad range of uses. Such limitations make it essential for users to control their personal data. A change of data ownership has thus been proposed by the National Strategy for Trusted Identities in Cyberspace, the Department of Commerce Green Paper, the Office of the President's International Strategy for Cyberspace, and the European Commission's 2012 reform of the data protection rules.

We introduce openPDS, an implementation of this new ownership model as a personal data store owned and controlled by the user. openPDS supports the creation of smart, data-driven applications while protecting the privacy of users' personal data. Because openPDS allows third-party applications to be installed, sensitive data processing can take place within a user's PDS through a secure question-answering framework. This framework allows the dimensionality of the data to be reduced on a per-need basis before it is anonymously shared. Unlike existing methods, such a privacy-preserving scheme does not require access to the whole database. openPDS also engages in privacy-preserving group computation to aggregate data across users. This framework simplifies many traditional security problems, such as restricting broad queries and their abuse, securing cloud storage, and building reputation and trust systems. Our initial deployment used smartphones to monitor the daily behavior of a set of individuals with diagnosed mental-health conditions and provided a first qualitative evaluation of the system. A large-scale deployment will start in Trento, Italy, in November in partnership with Telecom Italia.
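To make the question-answering idea concrete, the following sketch shows the general pattern: an application registers a question, the PDS evaluates it locally over the raw records, and only the reduced answer leaves the store. The class and method names are our illustrative assumptions, not the actual openPDS API.

```python
# Hedged sketch of a question-answering personal data store: raw,
# high-dimensional records never leave the PDS; only low-dimensional answers
# to registered questions are returned. All names are illustrative.

class PersonalDataStore:
    def __init__(self, records):
        self._records = records        # raw records stay inside the store
        self._questions = {}           # question name -> reducing function

    def register_question(self, name, reducer):
        """Install an approved question (a function from records to a summary)."""
        self._questions[name] = reducer

    def ask(self, name):
        """Answer a registered question; raw records are never returned."""
        return self._questions[name](self._records)


# Example: a wellbeing app only learns how many distinct places were visited,
# not the location trace itself.
pds = PersonalDataStore(records=[
    {"place": "home"}, {"place": "work"}, {"place": "gym"}, {"place": "home"},
])
pds.register_question("places_visited", lambda recs: len({r["place"] for r in recs}))
print(pds.ask("places_visited"))   # -> 3
```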

As technologists and scientists, we are convinced that there is amazing potential in the use of personal data, but also that its benefits should be balanced against its risks. By reducing the dimensionality of the data on the fly, or by anonymously answering questions instead of releasing raw data, openPDS opens up a new way for individuals to regain control over their privacy while allowing them to unlock the full value of their data.