(NB. I love it when people get it. Hal Hodson definitely gets it. Many folks at the MIT-KIT conference this week got it.)
03 October 2013 by Hal Hodson
Magazine issue 2937.
Software like openPDS acts as a bodyguard for your personal data when apps – or even governments – come snooping
Editorial: “Time for us all to take charge of our personal data”
BIG BROTHER is watching you. But that doesn’t mean you can’t do something about it – by wresting back control of your data.
Everything we do online generates information about us. The tacit deal is that we swap this data for free access to services like Gmail. But many people are becoming uncomfortable about companies like Facebook and Google hoarding vast amounts of our personal information – particularly in the wake of revelations about the intrusion of the US National Security Agency (NSA) into what we do online. So computer scientists at the Massachusetts Institute of Technology have created software that lets users take control.
OpenPDS was designed in MIT’s Media Lab by Sandy Pentland and Yves-Alexandre de Montjoye. They say it disrupts what NSA whistleblower Edward Snowden called the “architecture of oppression”, by letting users see and control any third-party requests for their information – whether that’s from the NSA or Google.
If you want to install an app on your smartphone, you usually have to agree to give the program access to various functions and to data on the phone, such as your contacts. Instead of letting the apps have direct access to the data, openPDS sits in between them, controlling the flow of information. Hosted either on a smartphone or on an internet-connected hard drive in your house, it siphons off data from your phone or computer as you generate it.
It can store your current and historical location, browsing history, content and information related to sent and received emails, and any other personal data required. When external applications or services need to know things about you to provide a service, they ask openPDS the question, and it tells them the answer – if you allow it to. People hosting openPDS at home would always know when entities like the NSA request their data, because the law requires a warrant to access data stored in a private home.
Pentland says openPDS provides a technical solution to an issue the European Commission raised in 2012, when it declared that people have the right to easier access to and control of their own data. “I realised something needed to be done about data control,” he says. “With openPDS, you control your own data and share it with third parties on an opt-in basis.”
Storing this information on your smartphone or on a hard drive in your house are not the only options. ID3, an MIT spin-off, is building a cloud version of openPDS. A personal data store hosted on US cloud servers would still be secretly searchable by the NSA, but it would allow users to have more control over their data, and keep an eye on who is using it.
“OpenPDS is a building block for the emerging personal data ecosystem,” says Thomas Hardjono, the technical lead of the MIT Consortium for Kerberos and Internet Trust, a collection of the world’s largest technology companies who are working together to make data access fairer. “We want people to have equitable access to their data. Today, AT&T and Verizon have access to my GPS data, but I don’t.”
Other groups also think such personal data stores are a good idea. A project funded by the European Union, called digital.me, focuses on giving people more control over their social networks, and the non-profit Personal Data Ecosystem Consortium advocates for individuals’ right to control their own data.
OpenPDS is already being put to use. Massachusetts General Hospital wants to use the software to protect patient privacy for a program called CATCH. It involves continuously monitoring variables including glucose levels, temperature, heart rate and brain activity, as well as smartphone-based analytics that can give insight into mood, activity and social connections. “We want to begin interrogating the medical data of real people in real time in real life, in a way that does not invade privacy,” says Dennis Ausiello, head of the hospital’s department of medicine.
OpenPDS will help people keep a handle on their own data, but getting back information already in private hands is a different matter. “As soon as you give access to that raw data, there’s no way back,” says de Montjoye.
So its only 2 weeks away to annual conference. Its beefing-up to be a solid conference, with some stellar speakers. Really excited about it!
So this is getting very interesting: The world’s largest chip maker wants to see a new kind of economy bloom around personal data (article here).
It looks like Intel is entering into the personal data & big data narrative. Given that Intel owns a considerable chunk of the motherboard & SoC real-estate (think Processors, discrete TPMs, AMT, etc. etc), they do indeed have access to the plumbing of my machine.
One question is whether hardware and chipset providers will begin to require end-users to agree to Terms of Service (allowing them to access data bits moving around the board). Such a move would complicate the user’s life. A typical person would then be forced to accept TOS and EULAs at three layers (at least):
- The hardware layer.
- The OS level (think EULAs)
- The application layer (think EULAs when installing Office productivity tools)
- The Services Provider (SP) and IdP layer (think Click-Thru agreement when signing-up to accounts)
Eve Maler kindly prepared an excellent set of slides for me to present at IIW#16 in Mountain View, CA late April: UMA_for_IIW16_2013-05
After discussions during the presentation, I believe one of the technical issues that still causes confusion is the fact that UMA uses three (3) distinct OAuth2.0 Tokens:
- AAT Tokens: Authorization API Token — this OAuth2.0 token is used by the Client to prove (to the Authorization Server) that it has authorization to access the APIs at the Authorization Server.
- PAT Tokens: Protection API Tokens — this OAuth2.0 token is used by the Resource Server to prove (to the Authorization Server) that the Resource Owner (e.g. Alice) has provided it (the Resource Server) with authorization to register Alice’s resource at the Authorization Server.
- RPT Tokens: Requesting Party Tokens — this OAuth2.0 token provides authorization for the Requesting Party to access resources at the Resource Server.
Here are the key take-aways:
- All three tokens are OAuth2.0 tokens.
- All three tokens are issued by the Authorization Server (or what used to be called the Authorization Manager in UMA).
- All three tokens should ideally be used in conjunction with the relevant parts of the UMA Binding Obligations (BO) spec. The BO spec tells the parties involved what their legal obligations will be.
Ray Campbell hits the ball out of the park again with his awesome suggestion in his blog: we need a HIPAA-like regime for the privacy of personal data. As a mental exercise, Ray has gone through the HIPAA document and substituted “individually identifiable health information” to “individually identifiable personal information“. The red-lined doc can also be found on his site.
The at the heart of his proposal is the notion of shifting the thought paradigm from the person as the absolute owner of his/her personal data to one where the person is seeking the right to know about who has his/her personal data, how they obtained it, what are they doing with it and to whom have they sold the data (the 4 questions).
Following on from Ray’s post and from Professor Sandy Pentland’s view on the New Deal on Data, I believe there should be a new market in the digital economy where individuals can meet directly with buyers of their personal data, and where individuals can opt-in to make more data about themselves available to these buyers. Cut out the middleman — the big data corporations that are not contributing to the efficiency of free markets.
People ask me all the time about the vision of the IDESG. The following provides a very useful summary (from the original NPO document):
“Individuals and organizations utilize secure, efficient, easy-to-use, and interoperable identity solutions to access online services in a manner that promotes confidence, privacy, choice, and innovation.”
Identity Solutions will be:
- Privacy-enhancing and voluntary
- Secure and resilient
- Cost-effective and easy to use
Today at the 3rd Plenary of the IDESG, the Chair of the IDESG (Bob Blakley) presented a high level vision slide of what the IDESG should be working on. Its a very good slide for the purposes of uniting the work of the IDESG. Each industry area (or stakeholder group) would end-up with its own Trust Framework Provider that covers IdPs in that space, and users and RPs.
MIT Media Lab – 2013 Legal Hack-a-thon on Identity
Today we had the privilege of hearing a presentation by Loius Wingers and Stefan Treatman-Clark on a couple of lightweight ciphers from the NSA. These are called SIMON and SPECK. The algorithms are not yet published, but they have a paper (pdf copy here) that shows some numbers on the performance of the proposed ciphers.
The SIMON and SPECK algorithms come in a family that range from 48-bits to 128-bits. Since the target deployment area is low-power and low memory devices (i.e. RFID devices, etc), the requirement is that these algorithms do not use more than 2000 gates. The paper has a table showing the GE and throughput.
Louis and Stefan presented SIMON and SPECK at the 2013 Legal Hack-a-Thon and the MIT Media Lab.
MIT Media Lab – 2013 Legal Hack-a-thon on Identity
Ray Campbell argues quite elegantly and convincingly that the “data ownership” paradigm is not the correct paradigm for achieving privacy and control over personal data. The notion that “I own my data” can be impractical especially in the light of 2-party transactions, where the other party may also “own” portions of the transaction data and where they might be legally bound to keep copies of “my data”.
Instead, the better approach is to look at “transparency” and visibility into where our data reside and who is using it. Here are the four questions that Ray poses:
- Who has my data
- What data they have about me
- How did they acquire my data
- How are they using my data
Transparency becomes an important tool disclosure management of personal data. These questions could be the basis for the development of a trust framework on data transparency, one which can be used to frame Terms of Service that both myself and the Relying Party must accept.