How to Keep Data Private With Google and Apple’s Contact Tracing App
In a rare instance of cooperation, Google and Apple, two pillars of the global tech industry, announced a joint effort to create a COVID-19 exposure tracing application for mobile phones in conjunction with world governments. The app, which is set to be available on both Android and iOS phones, relies on Bluetooth technology to warn against potential exposure to a person infected with COVID-19.
Because all three participating parties have shoddy track records on privacy, the application immediately raised the suspicions of privacy proponents. The Electronic Frontier Foundation, a staunch supporter of digital privacy, posed questions to developers and implored them to examine the cybersecurity and privacy implications of the joint contact-tracing app.
Google and Apple responded by tinkering with the tracing keys and encryption of the application to improve privacy, but questions still linger.
In particular, many of the benefits of the contact-tracing app are stifled by simple logistical problems: Bluetooth was not designed for contact tracing (it can estimate distance, but not whether the virus was actually transmitted), many people don’t carry Bluetooth-compatible cell phones, and most people won’t voluntarily download the app.
If we weigh the potential benefits vs. the privacy threat, is the app really worth it? Probably not, but a tokenized version would be much more palatable. Let’s explore why.
The feint of privacy
The Google and Apple tracing application relies on rolling proximity identifiers, or RPIDs, that are used to ping other Bluetooth devices. RPIDs are changed every few minutes, and users who believe they are infected can share their previous RPIDs with a public registry that verifies whether the user is infected and subsequently alerts any devices that recently exchanged pings with that user’s device.
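The rolling-identifier idea can be sketched in a few lines of Python. This is a simplified illustration, not the Apple and Google specification: the function names, the "RPID" label, the 10-minute interval and the truncated HMAC-SHA256 construction are all assumptions chosen for brevity.

```python
import hashlib
import hmac
import os

# Assumption: one identifier per 10-minute interval, i.e. 144 per day.
INTERVALS_PER_DAY = 144


def derive_rpid(daily_key: bytes, interval: int) -> bytes:
    """Derive a 16-byte rolling identifier from a daily key and an interval index."""
    # Hypothetical construction: HMAC-SHA256 over a label plus the interval
    # number, truncated to 16 bytes.
    message = b"RPID" + interval.to_bytes(2, "big")
    return hmac.new(daily_key, message, hashlib.sha256).digest()[:16]


def rpids_for_day(daily_key: bytes) -> list:
    """Return every identifier a device would broadcast over one day."""
    return [derive_rpid(daily_key, i) for i in range(INTERVALS_PER_DAY)]


if __name__ == "__main__":
    key = os.urandom(16)  # stays on the device unless the user reports a diagnosis
    day = rpids_for_day(key)
    print(len(day), "identifiers, first:", day[0].hex())
```

Without the daily key, two identifiers from the same device look unrelated to an observer. With the key, anyone can regenerate the full day's list.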
Apple and Google, admittedly, made an effort to reduce centralization by allocating most of the encryption keys to users’ devices rather than central servers, but some key problems persist. For example, as detailed by the EFF:
“A well-resourced adversary could collect RPIDs from many different places at once by setting up static Bluetooth beacons in public places, or by convincing thousands of users to install an app. […] But once a user uploads their daily diagnosis keys to the public registry, the tracker can use them to link together all of that person’s RPIDs from a single day.”
Consequently, the hacker could map out every movement of a user’s life, trivially determining who that person is. It’s the equivalent of having a real-time lens into a person’s daily movements. The EFF goes on to elaborate that the problem is not explicitly limited to Bluetooth but that Bluetooth is largely unsecured, and its attack surface needs to be reduced to a minimum.
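The linking attack the EFF describes can be illustrated with a short Python sketch. It assumes a hypothetical HMAC-based identifier derivation (not the real specification) and a tracker whose beacons log each identifier with an interval number and a place. All names and sample data here are invented for illustration.

```python
import hashlib
import hmac

INTERVALS_PER_DAY = 144  # assumption: 10-minute intervals


def derive_rpid(daily_key: bytes, interval: int) -> bytes:
    """Hypothetical derivation: truncated HMAC-SHA256 of a label and interval."""
    message = b"RPID" + interval.to_bytes(2, "big")
    return hmac.new(daily_key, message, hashlib.sha256).digest()[:16]


def link_sightings(published_key: bytes, sightings) -> list:
    """Return (interval, place) pairs belonging to the holder of published_key.

    sightings is an iterable of (rpid, interval, place) tuples recorded by
    the tracker's beacons. The result is sorted by interval.
    """
    # Once the daily key is public, the whole day's identifiers can be rebuilt.
    day_ids = {derive_rpid(published_key, i) for i in range(INTERVALS_PER_DAY)}
    matched = [(interval, place) for rpid, interval, place in sightings
               if rpid in day_ids]
    return sorted(matched)


if __name__ == "__main__":
    alice_key = bytes(range(16))
    other_key = bytes(range(16, 32))
    sightings = [
        (derive_rpid(alice_key, 50), 50, "train station"),
        (derive_rpid(other_key, 51), 51, "train station"),
        (derive_rpid(alice_key, 54), 54, "pharmacy"),
        (derive_rpid(alice_key, 100), 100, "apartment block"),
    ]
    for interval, place in link_sightings(alice_key, sightings):
        print(interval, place)
```

Before the key is published, the four sightings are unlinkable. Afterward, three of them collapse into one person's route for the day.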
Additionally, the government and police could have direct access to proximity tracking metrics for users, extracting pertinent information about their whereabouts and activities, should they choose. None of these concerns even take into account the security of the public registry or data leaked to Apple and Google’s servers.
We can boil down the inherent problem of the Google and Apple tracking system to trust.
Trust in the government and tech companies not to abuse the data, trust that the RPIDs uploaded to the registry are not spam (the registry has no way of authenticating real uploads from individuals), and trust that third-party developers won’t wield the system for surveillance purposes.
The entire system is based on trust, and what are decentralized cryptocurrency tokens good for? Rapid verification — e.g., auditability — and trust-minimization.
The advantages of tokenization
First, it’s hard to ignore that Apple and Google could’ve turned to open-source cryptography and its accompanying class of willing privacy-oriented startups and activists right out of the gate. People would feel much more comfortable. But they didn’t — no surprise.
Much of the terminology used by the two companies has also been nebulous. There are also concerns about aspects of the application that these companies would have direct control over, such as turning off notifications and proximity tracking, even after the crisis is over.
Such powers should be entirely removed from the hands of these centralized, profit-oriented corporate entities. An ideal way to do that would be tokenized and encrypted verification of infected proximity RPIDs.
For example, customized parameters for the proximity tracking could be baked into each token. Tokens are not under the development auspices of any single entity, and the tokens can be burned by the token users once the token’s utility is finished. There’s no umbrella switch under the control of a company that keeps the application running; it’s entirely decentralized and retains permissionless access.
Each specific user would have a token allocated to them, with RPIDs encrypted and managed solely on that user’s device. If users believe they are COVID-19 positive, they can send an attestation to the public registry. An accredited clinic or hospital can then issue a certificate denoting a positive diagnosis for those users. As no publicly identifiable data needs to be submitted, the burdensome process of government service is replaced by much quicker technology.
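One way to picture the certificate step is the Python sketch below. It is an assumption-laden stand-in, not a description of any real registry or of TokenScript. An HMAC tag under a secret shared by clinic and registry takes the place of a proper public-key signature, so the example needs only the standard library. In practice the registry would more plausibly hold just the clinic's public key. All function names are hypothetical.

```python
import hashlib
import hmac


def issue_certificate(clinic_secret: bytes, daily_key: bytes) -> bytes:
    """Clinic side: tag a hash of the patient's daily key, not their identity."""
    key_digest = hashlib.sha256(daily_key).digest()
    # HMAC is used here only as a stand-in for a digital signature.
    return hmac.new(clinic_secret, b"positive:" + key_digest,
                    hashlib.sha256).digest()


def registry_accepts(clinic_secret: bytes, daily_key: bytes,
                     certificate: bytes) -> bool:
    """Registry side: accept an upload only if the clinic's tag verifies."""
    expected = issue_certificate(clinic_secret, daily_key)
    return hmac.compare_digest(expected, certificate)


if __name__ == "__main__":
    clinic_secret = b"demo-clinic-secret"  # made-up value for the demo
    patient_key = bytes(range(16))
    cert = issue_certificate(clinic_secret, patient_key)
    print(registry_accepts(clinic_secret, patient_key, cert))    # genuine upload
    print(registry_accepts(clinic_secret, patient_key, b"spam"))  # forged upload
```

The certificate binds to the key material alone, so the registry can filter out spam uploads without ever learning a name.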
From there, the actual location data of the individual can be baked within the token while the pertinent details (e.g., a de-identified COVID-19 swab result) for authenticating the status of infection could be released. TokenScript acts as the point of communication between services that need the data and the actual data that never leaves the mobile phone. This curbs the ability of both governments and unscrupulous third-party developers to wield location data for unethical means.
Relevant details for confirming diagnoses, not paired with location data, could be sent to third-party organizations, like the World Health Organization, without fears of them abusing privacy. In practice, this can be done by the patient visiting the WHO website, which will request either multi-party computation or a zero-knowledge proof of relevant data. The security enclave in TokenScript ensures that the website does not learn the original data but only the computational results. The entire medical industry should rely on de-identified patient data to guard against the ethical violations of pharmaceutical companies. The solution proposed here also anonymizes patient data, but does so locally on the user’s cell phone, without assuming the organization to be honest and secure. It would also be censorship-resistant and quicker, so quick that the website can instantly update its statistics and reporting as users use their tokens to participate in the computation of new reports on the website.
For example, a person named Michael wishes to know if he has ever crossed paths with a COVID-19-positive person. He could initiate a round of multi-party computations that identifies other users of the app who have been identified as positive. The mobile devices of those who have been identified as COVID-19 positive could participate in MPC, thereby helping Michael to learn if he has been in contact with them without exposing sensitive information to that person, such as when and where the possible transmission occurred. The larger the size of both groups (normal users and identified positive cases), the higher the level of privacy will be. With some future advancements in cryptography, we can even look forward to the day when this can be done without the patient’s mobile phone being online to participate in the computation and merely through obfuscated data submitted to a public registry.
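Michael's query can be approximated with a much simpler relative of general MPC: a two-party private set intersection count built on Diffie-Hellman-style commutative blinding. The Python sketch below runs both parties in one process. It uses a toy modulus (the Mersenne prime 2**127 - 1) that is far too small for real use, and every name in it is hypothetical. It is meant only to show how an overlap can be counted without either side handing over raw identifiers.

```python
import hashlib
import math
import secrets

# Toy parameter: the Mersenne prime 2**127 - 1. Not secure; illustration only.
P = 2**127 - 1


def hash_to_group(item: bytes) -> int:
    """Map an identifier to an integer in [1, P - 1]."""
    digest = int.from_bytes(hashlib.sha256(item).digest(), "big")
    return digest % (P - 1) + 1


def random_exponent() -> int:
    """Pick a secret exponent coprime to P - 1, so blinding is one-to-one."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values, exponent: int) -> list:
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, exponent, P) for v in values]


def count_matches(michael_seen, patient_broadcast) -> int:
    """Simulate both parties and return how many identifiers overlap."""
    a = random_exponent()  # Michael's secret
    b = random_exponent()  # the patient's secret

    # Round 1: Michael sends the identifiers he observed, blinded with a.
    from_michael = blind([hash_to_group(x) for x in michael_seen], a)
    # Round 2: the patient adds b and sorts the result to hide the order,
    # then sends their own broadcast identifiers blinded with b.
    double_blinded = sorted(blind(from_michael, b))
    from_patient = blind([hash_to_group(y) for y in patient_broadcast], b)
    # Round 3: Michael adds a to the patient's values and counts overlaps.
    patient_double = set(blind(from_patient, a))
    return sum(1 for v in double_blinded if v in patient_double)


if __name__ == "__main__":
    seen = [b"id-1", b"id-2", b"id-3"]
    print(count_matches(seen, [b"id-3", b"id-9"]) > 0)  # possible exposure?
```

Because the patient's device reorders the doubly blinded values, Michael learns only how many identifiers matched, not which ones, and so not when or where the contact happened.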
Many of the problems that flow from proximity-based applications are privacy-oriented. And while tokens today do not provide perfect privacy without being exceptionally cumbersome, there is room for improvement. There’s no precedent for a wide-scale surveillance app like the one being built, and the nebulous descriptions of some of its characteristics are concerning. Probably aware of the backlash they would receive, Apple and Google have made an effort to distribute much of the data, but the security and privacy leaks are plentiful, primarily via the transfer of authentication between the public registry and individual.
A tokenized version of the authentication certificate, RPID cross-referencing, and use across multiple systems would be a better option for verification at scale without sacrificing privacy or control over the data to a third party. As data is computed locally, an adversary has little to gain by going after Google or Apple. It may not be a panacea, but exploring how tokenization works in the case of disease tracing should become a notable area of research and development, lest we forfeit privacy for safety at the whims of governments.
Or, perhaps considering the limitations and uncertainty of Bluetooth, simply avoiding surveillance applications issued by a joint government and big tech initiative altogether is much simpler.
The views, thoughts and opinions expressed here are the author’s alone and do not necessarily reflect or represent the views and opinions of Cointelegraph.