Professor Joan Feigenbaum: Research-wise, I think what’s most interesting right now is “cryptographic computing.” The goal is to be able to compute on encrypted data, or more generally, to be able to perform a computation on sensitive data without actually revealing the data to the computer that is performing the computation.
Vivek Gopalan: Is that the same as Zero-Knowledge Proofs?
JF: Zero-knowledge proofs are similar but not exactly the same thing. In a zero-knowledge protocol, there is a prover and a verifier who have a common input x. The prover claims that x has property P. If that’s a true claim, then they can convince the verifier that it’s true just by following the protocol. If they’re making a false claim, then they cannot convince the verifier that it’s true even if they cheat and deviate from the prescribed protocol. What’s remarkable and completely novel about these protocols is that the only knowledge that the verifier gains by executing the protocol with an honest prover is the fact that x has property P; nothing about the actual proof is revealed. But note that the goal is to hide the proof, not to hide the data. Both prover and verifier have full access to the input x.
VG: So what’s cryptographic computing?
JF: Cryptographic computing is not a question of hiding proofs. It’s a question of hiding data. Suppose I want to do a database query. I’m using a cloud-computing service and have stored my relational database in the cloud. Can I actually store encrypted data, and make queries against it, in such a way that the cloud-service provider does not have to decrypt first?
Cryptographic computing is a well developed research area. One of the foundational papers was published by my PhD advisor Andy Yao back in 1982 when I was a second-year graduate student. Remember that, when I first found out about RSA encryption, I said “Wow. This is so clever.” I expected to be using it the next day, and that was in 1980. I still had fifteen or twenty years to go before it was in widespread use. Same thing with this “computing on encrypted data” idea. I saw Yao’s paper, and I thought, “Wow. This is so clearly useful.” But it’s been in development for a very long time, but only now is it finally getting looked at seriously.
I gave you the example of a database stored by a cloud-computing service. Here’s an example of something I personally worked on. We were talking about surveillance and how many people think there’s just too much it, including by governments who claim that they’re surveilling us for our own good. Here’s an example of how we could use cryptographic-computing protocols to cut down on that. What’s that place out in the west where four states meet?
VG: The Four Corners?
JF: The Four Corners – yeah. In three of those four mountain states, there was a series of bank robberies by what law enforcement called the High Country Bandits. Old-fashioned bank robberies, you know – the Bandits walked in, pointed guns at the tellers, and said, “give us the cash.” The FBI got involved, because this was an interstate violent crime. They wound up catching the Bandits, because somebody who was in the bank during one of the robberies said that he saw one of the Bandits using a cellphone.
Thinking that he may have used the same cellphone in every robbery, they got cell-tower dumps from the closest cell towers for the times of all three robberies, intersected them, and discovered that there was just one number in the intersection. They concluded that that was probably the robber’s number; they got the name and address from the cell-phone provider and arrested him, after which he confessed and ratted out his friends. Most people to whom I tell that story say “Oh wow! That’s a great use of technology.” Then I point out that there’s a problem: In the process of doing this high-tech investigation, the FBI collected the phone numbers of a total of about 150,000 people, all but one of whom were totally innocent. If they’d done an analogous investigation in an urban area, they’d have collected millions of phone numbers of innocent people. I don’t think that we should have personal data like “who was using what cellphone tower” collected by law enforcement if it’s avoidable.
But is it avoidable? Yes! There are well studied, practical protocols for computing the intersection of sets of encrypted data. We could have the phone companies provide to law enforcement encrypted cell-tower dumps: phone numbers and metadata of all the calls handled by the relevant cell towers during the relevant time windows, in encrypted form. Using a privacy-preserving set-intersection protocol, the law-enforcement agency could decrypt only the phone numbers in the intersection; no information would be revealed about the rest of the phone numbers that used those towers at the relevant times, because they would remain encrypted. In the High Country Bandits example, the FBI would have decrypted only the phone number of the Bandit whose phone used all three towers at the relevant times, and it would have learned nothing about the phone numbers of the innocent bystanders.
VG: Would all cell towers then need the same [cryptographic] key?
JF: No, but they would need to have foresight and to use a particular encryption scheme. They would also have to format all of their data for this particular privacy-preserving set-intersection protocol. That wouldn’t be hard.
VG: How else could cryptographic computing be used?
JF: It is potentially very useful in cloud computing. Ultimately, I believe that there will be many customers and potential customers of cloud-computing services who worry about exposing sensitive data to the cloud providers. Even if they’re not worried, they may be under contractual obligation to their customers not to expose confidential data to third parties – even third-party cloud-service providers whom they trust.
Moreover, emerging data-protection regulations can make it very difficult to move data across national boundaries or to share it with third parties. Data collectors who need to do sophisticated computations will either have to do everything in house, which is not an attractive option or even a feasible one for many organizations, or have to outsource computation in a provably privacy-preserving manner. Cryptographic computing is the only body of technique that addresses these emerging needs.
Fortunately, it’s not just I who thinks this. DARPA and IARPA – the US military and intelligence research-funding agencies – want to use cryptographic computing and have funded a lot of research in the area over the last 10 years. There are also a number of promising start-up companies developing products and services.
That’s my mission for the next phase of my career. I want cryptographic computing to be widely deployed. Hundreds of papers have been written about it, and I think that they’re totally convincing about the value of this stuff. It’s not just a technology – it’s a mindset. People should instinctively know that there are privacy regulations we must comply with and that our own data are valuable. Whenever we can do important computations on encrypted data, we should.