Archive for the ‘Trust’ Category
I often get asked to provide a brief explanation about MIT Enigma — notably what it is, and why it is important particularly in the current age of P2P networking and blockchain technology. So here’s a brief summary.
The MIT Enigma system is part of a broader initiative at MIT Connections Science called the Open Algorithms for Equity, Accountability, Security, and Transparency (OPAL-EAST).
The MIT Enigma system employs two core cryptographic constructs simultaneously atop a Peer-to-Peer (P2P network of nodes). These are secrets-sharing (ala Shamir’s Linear Secret Sharing Scheme (LSSS)) and multiparty computation (MPC). Although secret sharing and MPC are topics of research for the past two decades, the innovation that MIT Enigma brings is the notion of employing these constructions on a P2P network of nodes (such as the blockchain) while providing “Proof-of-MPC” (like proof of work) that a node has correctly performed some computation.
In secret-sharing schemes, a given data item is “split” into a number of ciphertext pieces (called “shares”) that are then stored separately. When the data item needs to be reconstituted or reconstructed, a minimum or “threshold” number of shares need to be obtained and merged together again in a reverse cryptographic computation. For example, in Naval parlance this is akin to needing 2 out of 3 keys in order to perform some crucial task (e.g. activate the missile). Some secret sharing schemes possess the feature that some primitive arithmetic operations can be performed on shares (shares “added” to shares) yielding a result without the need to fully reconstitute the data items first. In effect, this feature allows operations to be performed on encrypted data (similar to homomorphic encryption schemes).
The MIT Enigma system proposes to use a P2P network of nodes to randomly store the relevant shares belonging to data items. In effect, the data owner no longer needs to keep a centralized database of data-items (e.g. health data) and instead would transform each data item into shares and disperse these on the P2P network of node. Only the data owner would know the locations of the shares, and can fetch these from the nodes as needed. Since each of these shares appear as garbled ciphertext to the nodes, the nodes are oblivious to their meaning or significance. A node in the P2P network would be remunerated for storage costs and the store/fetch operations.
The second cryptographic construct employed in MIT Enigma multiparty computation (MPC). The study of MPC schemes seeks to address the problem of a group of entities needing to share some common output (e.g. result of computation) whilst maintaining as secret their individual data items. For example, a group of patients may wish to collaboratively compute their average blood pressure information among them, but without each patient sharing actual raw data about their blood pressure information.
The MIT Enigma system combines the use of MPC schemes with secret-sharing schemes, effectively allowing some computations to be performed using the shares that are distributed on the P2P. The combination of these 3 computing paradigms (secret-sharing, MPC and P2P nodes) opens new possibilities in addressing the current urgent issues around data privacy and the growing liabilities on the part of organizations who store or work on large amounts of data.
Here are the three (3) principles for privacy-preserving computation based on the Enigma P2P distributed multi-party computation model:
(a) Bring the Query to the Data: The current model is for the querier to fetch copies of all the data-sets from the distributed nodes, then import the data-sets into the big data processing infra and then run queries. Instead, break-up the query into components (sub-queries) and send the query pieces to the corresponding nodes on the P2P network.
(b) Keep Data Local: Never let raw data leave the node. Raw data must never leaves its physical location or the control of its owner. Instead, nodes that carry relevant data-sets execute sub-queries and report on the result.
(c) Never Decrypt Data: Homomorphic encryption remains an open field of study. However, certain types of queries can be decomposed into rudimentary operations (such as additions and multiplications) on encrypted data that would yield equivalent answers to the case where the query was run on plaintext data.
One important news item this week from the IoT space is the support by Atmel of Intel’s EPID technology.
Enhanced Privacy ID (EPID) grew from the work of Ernie Brickell and Jiangtao Li based on previous work on Direct Anonymous Attestations (DAA). DAA is very relevant because it is built-in into the TPM1.2 chip (of which there are several hundred million in PC machines).
Here is a quick summary of EPID:
- EPID is a special digital signature scheme.
- One public key corresponds to multiple private keys.
- Private key generates a EPID signature.
- EPID signature can be verified using the public key.
Interesting Security Properties:
- Anonymous/Unlinkable: Given two EPID signatures one cannot determine whether they are generated from one or two private keys.
- Unforgeable: Without a private key one cannot create a valid signature.
So the topic of “trust” always generates a million emails on various lists. Rather than rolling-up my own definition, I thought I’d borrow a good definition from the Trusted Computing Group community (courtesy of Graeme Proudler of HP Labs, UK).
It is safe to trust something when:
- It can be unambiguously identified.
- It operates unhindered.
- The user has first hand experience of consistent, good, behavior.
The definition is that of “technical trust”, namely “trust” in the mechanics of some computation (e.g. cryptographic computation, etc). In this case it refers to the TPM hardware. Note that “unhindered operation” is paramount for technical trust. This is still somewhat of a challenge for software (eg. think multi-tenant clouds and VMs).