Rachel is a student at a US university who was sexually assaulted on campus. She decided against reporting it (fewer than 10% of survivors do). What she did, however, was register the assault on a website that is using novel ideas from cryptography to help catch serial sexual predators.
The organization Callisto lets a survivor enter their name in a database, together with identifying details of their assailant, such as social media handle or phone number. These details are encrypted, meaning that the identities of the survivor and the perpetrator are kept secret. If you hacked into the database, there would be no way to identify either party.
However, if the same perpetrator is named by two people, the website registers a match and this triggers an email to two lawyers. Each lawyer receives the name of one of the survivors (but not the name of the perpetrator). The lawyers then contact the survivors to let them know of the match and offer to help coordinate any further action should they wish to pursue it.
In short, Callisto enables the survivors of sexual assault to do something unprecedented: they can discover if their abuser is a repeat offender without identifying themselves to the authorities or even identifying the name of the abuser. They have learned something useful, and possibly helpful, without having given anything away. “Survivors can find it healing to know they are not the only one. They don’t feel it is their fault,” says Tracy DeTomasi, Callisto CEO. And there is strength in numbers. “Maybe one person doesn’t have a case, but two people do.”
The ability of two strangers to pool their knowledge without revealing any personal information to each other is a seemingly paradoxical idea from theoretical computer science that is fueling what many are calling the next revolution in tech. The same theory enables, for example, two governments to discover if their computer systems have been hacked by the same enemy, without either government disclosing confidential data, or two banks to discover if they are being defrauded by the same person, without either bank breaking financial data protection laws.
The umbrella term for these new cryptographic techniques, in which you can share data while keeping that data private, is “privacy-enhancing technologies”, or Pets. They offer opportunities for data holders to pool their data in new and useful ways. In the health sector, for example, strict rules prohibit hospitals from sharing patients’ medical data. Yet if hospitals were able to combine their data into larger datasets, doctors would have more information, which would enable them to make better decisions on treatments. Indeed, a project in Switzerland using Pets has since June allowed medical researchers at four independent teaching hospitals to conduct analysis on their combined data of about 250,000 patients, with no loss of privacy between institutions. Juan Troncoso, co-founder and CEO of TuneInsight, which runs the project, says: “The dream of personalized medicine relies on larger and higher-quality datasets. Pets can make this dream come true while complying with regulations and protecting people’s privacy rights. This technology will be transformative for precision medicine and beyond.”
The past couple of years have seen the emergence of dozens of Pet startups in advertising, insurance, marketing, machine learning, cybersecurity, fintech and cryptocurrencies. Governments are also getting interested. Last year, the United Nations launched its “Pet Lab”, which has nothing to do with the welfare of domestic animals, but is instead a forum for national statistical offices to find ways to share their data across borders while protecting the privacy of their citizens.
Jack Fitzsimons, founder of the UN Pet Lab, says: “Pets are one of the most important technologies of our generation. They have fundamentally changed the game, because they offer the promise that private data is only used for its intended purposes.”
The theoretical ideas on which Pets are based are half a century old. In 1982, the Chinese computer scientist Andrew Yao asked the following question: is it possible for two millionaires to discover who is richer without either one revealing how much they are worth? The counterintuitive answer is that, yes, it is possible. The solution involves a process in which the millionaires send packets of information between each other, using randomness to hide the exact numbers, yet at the end of it, both millionaires are satisfied that they know who is the richer, without either of them knowing any other details of the other one’s wealth.
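Yao's comparison protocol is too intricate to reproduce in a few lines, but the underlying trick of hiding exact numbers behind randomness can be illustrated with a simpler task. The sketch below is not the millionaires protocol: it uses a different technique, additive secret sharing, to let several parties learn their combined wealth (rather than who is richer) without anyone seeing another party's figure. The modulus, the function names and the example fortunes are all assumptions made for illustration.

```python
import secrets

# Hypothetical public modulus; it only needs to exceed any possible total.
MODULUS = 2**61 - 1

def split_into_shares(value, n_parties):
    """Split value into n random-looking shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_values):
    """Simulate all parties in one process; each list index stands for a party."""
    n = len(private_values)
    # Each party splits its own value and hands one share to every party.
    all_shares = [split_into_shares(v, n) for v in private_values]
    # Party j adds up the shares it received and publishes only that subtotal.
    subtotals = [sum(all_shares[i][j] for i in range(n)) % MODULUS
                 for j in range(n)]
    # The published subtotals add up to the true total.
    return sum(subtotals) % MODULUS

fortunes = [12_000_000, 7_500_000, 30_250_000]  # made-up private inputs
total = secure_sum(fortunes)
print(total == sum(fortunes))
```

Any single share is a uniformly random number, so on its own it says nothing about the fortune it came from. Only the sum of all the published subtotals is meaningful.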
Yao’s “millionaires problem” was one of the foundational ideas of a new field in cryptography – “secure multiparty computation” – in which computer scientists investigated how two or more parties could interact with each other in such a way that each party kept important information secret and yet all were able to draw meaningful conclusions from their pooled data. This work led in the mid-1980s to a flourishing of increasingly mind-bending results, one of the most dazzling being the “zero-knowledge proof”, in which it is possible for a person to prove to someone else that they have some secret information without revealing any information about it! It allows you, say, to prove that you have solved a sudoku without having to reveal any details of your solution. Zero-knowledge proofs involve a process, as with the millionaires problem, in which the prover sends and receives packets of information in which crucial details are obfuscated with randomness.
Another valuable instrument in the Pet toolbox is “fully homomorphic encryption”, a magical procedure often called the holy grail of cryptography. It enables person A to encrypt a dataset and give it to person B, who will run computations on the encrypted data. These computations provide B with a result, itself encrypted, which can only be decrypted once passed back to A. In other words, person B has performed analytics on a dataset while learning nothing about either the data or the result of their analytics. (The principle is that certain abstract structures, or homomorphisms, are maintained during the encryption process.) When fully homomorphic encryption was first mooted in the 1970s, computer scientists were unsure it would even be possible and it was only in 2009 that the American Craig Gentry demonstrated how it could be done.
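The exchange of randomized packets can be sketched with one classic protocol of this kind, Schnorr identification. The statement being proved here is not a sudoku solution but knowledge of a secret number x behind a public value y. The parameters (P = 23, Q = 11, G = 2) are toy values far too small for real use, and the function names are invented for this example. The protocol is zero-knowledge only under the assumption that the verifier picks challenges honestly.

```python
import secrets

P = 23  # small prime modulus (toy size, for illustration only)
Q = 11  # prime order of the subgroup generated by G
G = 2   # generator: pow(2, 11, 23) == 1

def keygen():
    """Prover's secret x and the public value y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def commit():
    """Prover picks fresh randomness r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def challenge():
    """Verifier replies with a random challenge c."""
    return secrets.randbelow(Q)

def respond(r, c, x):
    """Prover's answer; the random r masks the secret x."""
    return (r + c * x) % Q

def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c mod P without ever seeing x."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

def run_round(x, y):
    r, t = commit()
    c = challenge()
    s = respond(r, c, x)
    return verify(y, t, c, s)

secret_x, public_y = keygen()
print(all(run_round(secret_x, public_y) for _ in range(20)))
```

With numbers this small, someone who does not know x can still pass a single round by luck about one time in 11, which is why a real deployment would use a very large group or many repeated rounds.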
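The idea of computing on data that stays encrypted can be shown with a much older and simpler scheme than Gentry's. The sketch below implements a toy version of the Paillier cryptosystem, which is only additively homomorphic: it lets person B add encrypted numbers but not run arbitrary computations, so it is not fully homomorphic encryption. The primes are tiny and chosen purely for illustration, and the function names are this example's own.

```python
import math
import secrets

def keygen(p=1009, q=1013):
    """Return the public modulus n and A's private key (toy-sized primes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8 or later
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Person B's step: multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % (n * n)

# Person A encrypts two numbers and hands over only the ciphertexts.
n, private_key = keygen()
c1, c2 = encrypt(n, 1200), encrypt(n, 345)
# Person B combines them without the private key.
c_sum = add_encrypted(n, c1, c2)
# Only A can read the result.
print(decrypt(n, private_key, c_sum))
```

Person B handles nothing but large, random-looking integers, yet the value A decrypts is the sum of the two original numbers.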
These three groundbreaking concepts – secure multiparty computation, zero-knowledge proofs and fully homomorphic encryption – are different ways that data can be shared but not revealed. In the 1980s, during the early years of research, cryptographers were not thinking that these innovations might have any practical uses, in large part because there were no obvious real-world problems to which they were a solution.
Times have changed. The world is awash with data, and data privacy has become a hugely contentious political, ethical and legal issue. After half a century in which Pets were essentially arcane academic games, they are now seen as a solution to one of the defining challenges of the digital world: how to keep sensitive data private while also being able to extract value from that data.
The emergence of applications has driven the theory, which is now sufficiently well developed to be commercially viable. Microsoft, for example, uses fully homomorphic encryption when you register a new password: the password is encrypted and then sent to a server that checks whether or not that password is in a list of passwords that have been discovered in data breaches, without the server being able to identify your password. Meta, Google and Apple have also over the last year or so been introducing similar tools to some of their products.
In addition to new cryptographic techniques, Pets also include advances in computational statistics such as “differential privacy”, an idea from 2006 in which noise is added to results in order to preserve the privacy of individuals. This is useful in applications such as official statistics, where simple averages can reveal private information about people coming from minority groups.
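A minimal sketch of the idea, assuming the standard Laplace mechanism applied to a simple counting query: the true count is computed, then random noise scaled by a privacy parameter epsilon is added before the figure is released. The dataset, the function names and the choice of epsilon are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng):
    """Release a noisy count of the records that satisfy predicate."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1,
    # so noise with scale 1/epsilon gives epsilon-differential privacy.
    return true_count + laplace_noise(1 / epsilon, rng)

# Made-up ages standing in for a small survey.
ages = [23, 35, 41, 67, 72, 19, 88, 54, 90, 31]
rng = random.Random()
noisy = private_count(ages, lambda age: age >= 65, epsilon=0.5, rng=rng)
print(round(noisy, 2))
```

A smaller epsilon means more noise and stronger privacy. Any single released figure is blurred, but the noise averages out to zero over many releases, so aggregate trends remain usable.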
Much of the recent investment in Pets has come from cryptocurrencies. Earlier this year, crypto-exchange Coinbase spent more than $150m to buy Unbound Security, a multiparty computation startup co-founded by Briton Nigel Smart, professor of cryptography at KU Leuven in Belgium. “In the blockchain space, multiparty computation is now everywhere,” he says. “In the last year it has gone from ‘will this work?’ to being standard.”
He believes Pets will eventually spread across the entire digital ecosystem. “This is the future. It is not a fad. What this tech allows you to do is collaborate with people you wouldn’t have thought of collaborating with before, either because it was legally impossible to do so, or because it wasn’t in your business interest, since you would have been revealing information. This opens up new markets and applications, which we are only just starting to see. It’s like in the early days of the internet, no one knew what apps would come along. We are in the same situation with Pets.
“I think it is becoming more and more intrinsic. You see it everywhere. All data will eventually be computed with privacy-enhancing tech.”
The current applications of Pets are niche, partly because the technology is so new, but also because many people are unaware of it. Earlier this year, the UK and US governments jointly launched a £1.3m prize for companies to come up with ideas to “unleash the potential of Pets to combat global societal challenges”.
Yet some uses are already having an effect, such as Callisto. DeTomasi says that 10-15% of survivors who have used the site have had matches, meaning that their assailants have numerous victims. DeTomasi does not know the names of any survivors with matches, or the names of the assailants, since the system keeps them secret. (The “Rachel” mentioned in the introduction is an invented name for the purposes of illustration.)
DeTomasi does say, however, that 90% of sexual assaults on campuses are by serial offenders, who on average will perpetrate six times during their college year. “So if we stop them after two times, we are preventing 59% of assaults from occurring.” Callisto is currently available at 40 universities in the US, including Stanford, Yale, Notre Dame and Northwestern, and the plan is to roll it out to all universities. “It is definitely needed,” she adds, “and it is definitely working.”
The secret life of Pets
Four of the most important privacy-enhancing technologies
Secure multiparty computation
Allows two or more parties to compute on their shared data, without any party revealing any of their private data.
Zero-knowledge proofs
Allows a person to prove to another person that they know something is true, without revealing any information on how they know it is true.
Fully homomorphic encryption
The so-called holy grail of cryptography, in which it is possible to run analytics on encrypted data without decrypting it first.
Differential privacy
A way of adding noise to data that preserves privacy.