IBM researcher Craig Gentry discusses breakthrough
Last month, IBM announced that researcher Craig Gentry made a cryptographic breakthrough with big implications for data privacy. The so-called “fully homomorphic encryption” technology is years away, according to IBM, but its promise for companies that handle and store individuals’ data is anticipated to be profound.
”[Fully homomorphic encryption] would give us new confidence that you can secure the world’s infrastructure and bring intelligence to it without exposing data to greater risks,” the head of IBM’s research arm told Forbes.
Researcher Craig Gentry discusses the breakthrough here.
IAPP: Is the anticipated implementation platform-neutral (i.e., will it support any database structure)?
Gentry: Yes. Abstractly, you can view a ciphertext as a secure box that holds a message. Now, suppose that you have several messages m_1, ...m_t (or several database entries) that are encrypted under the same public key—i.e., the messages are, in some sense, inside the same box. If the encryption scheme is fully homomorphic, then, for any efficiently computable function f, you can use the encryptions of m_1, ..., m_t to compute a ciphertext that encrypts f(m_1, ...m_t) -- i.e., you get f (m_1, ..., m_t) inside the same box—and you can compute this new ciphertext without the secret decryption key.
For example, a user could store encryptions of m_1, ..., m_t on a database server. Suppose that the user later wants some function of its data—i.e., f(m_1, ..., m_t). This function f might be a JOIN query, or it might output all of the entries containing some substring, ... whatever. Using the fully homomorphic encryption scheme, the server can compute an encryption of f(m_1, ..., m_t), send that ciphertext to the user, and the user decrypts it to recover the desired information.
Again, as long as you can compute the function f reasonably efficiently, then you can compute the same function over encrypted data (getting an encrypted result), though computing the function over encrypted data takes much longer. In particular, the scheme is independent of the data structures used. (In contrast, most likely the function f that you want to compute would be dependent on the data structure.)
IAPP: Does the current implementation encrypt at the table, column, or field level, or are these reference models inapplicable?
Gentry: There isn’t an ”implementation” yet; currently, there is only a mathematical description of the scheme.
The reference models are inapplicable, as stated above. Any program or database-querying system could be ”compiled” into an (encypted) program or (encrypted) system that has encrypted input and output.
IAPP: How does it compare to other fully homomorphic encryption schemes? What makes it different?
Gentry: There aren’t any other fully homomorphic encryption schemes. A bit more accurately, and to get a bit technical, there are previous schemes but their performance (running time) depends *exponentially* on the function f that you are computing, whereas the running time of my scheme depends only *polynomially* on the function f. In other words, to put it simply, even though my current scheme is rather slow, previous schemes would be much, much slower—ridiculously slower—for mildly complex functions f.
IAPP: A Forbes article on your scheme mentioned that the current application is cumbersome, and that years of newly developed efficiencies will be required to make it commercially viable. Can the technology be implemented (albeit, inefficiently) on typical current datawarehouse hardware, or are you anticipating that hardware advances over the coming years would make this viable? Is the sense that optimization of the algorithm will improve performance, or that computing hardware will advance to make this unnecessary?
Gentry: I’m optimistic that, over the next couple of years or so, researchers will find ways to make the algorithms themselves significantly faster, and that the algorithmic speed-up will be much larger than the hardware speed-up over this period. I think it makes sense to find this low-hanging fruit before commercializing it.
It’s dangerous to speculate on when it will be ready to be used on databases...
One issue is that, although part of the appeal of my scheme is that it is ”fully” homomorphic in the sense that it can handle arbitrary functions f, my scheme performs significantly better if the complexity of the function f falls below a certain threshold. Once the complexity of f goes above a certain threshold, I need to use a technique that I call ”bootstrapping,” a computationally-intensive operation in which I ”refresh” the intermediate ciphertexts on my way to computing the eventual output ciphertext (an encryption of f(m_1, ..., m_t)). However, if the complexity of f falls below the threshold, I can get by without using bootstrapping, and the scheme is much closer to being practical for such functions. I would imagine that relatively simple functions, such as a JOIN operation in a database query, would fall into this below-the-threshold category.