
ROBERT's desired property is not clearly stated, proposing: "k-anonymity" #52


Open
superboum opened this issue Apr 30, 2020 · 3 comments

Comments

@superboum
superboum commented Apr 30, 2020

ROBERT's authors refer three times to @vaudenay's security analysis of DP-3T in their paper. Indeed, they position their proposal as an alternative to "decentralized" applications like DP-3T:

Although it might seem attractive in term of privacy to adopt a fully decentralized solution, such
approaches face important challenges in terms of security and robustness against malicious users [6].

The authors elaborate on this point later:

Other, qualified as “decentralized”, schemes broadcast to each App an aggregate information containing the pseudonyms of all the infected users [1]. This information allows each App to decode the identifiers of infected users and verify if any of them are part of its contact list. Our scheme does not follow this principle because we believe that sending information about all infected users reveals too much information. In fact, it has been shown that this information can be easily used by malicious users to re-identify infected users at scale [6].

It seems that the main issue they identify is exposure deanonymization.
However, defining exactly which attacks could lead to such deanonymization is not that simple.
This has been discussed at length in #46, where it appears that, without a proper authentication mechanism, ROBERT is vulnerable to the same attacks as DP-3T. At the same time, it has been shown that such deanonymization attacks cannot be conducted at scale on DP-3T. Finally, owing to a problem inherent to tracing applications, if you met only one person during the last 15 days and receive a notification, you will be able to identify the infected person. Please refer to aboutet's message for the authentication part, veale's message for DP-3T mitigations, and risques-tracage.fr for attacks inherent to tracing apps.

We define authenticated ROBERT as a patched version of ROBERT that is not vulnerable to Sybil attacks

It seems that the only goal of authenticated ROBERT is, contrary to DP-3T, to prevent a notified user from learning who may have infected her, provided she met more than one app user in the last 15 days.

It looks like we could rephrase this desired property as "k-exposure-anonymity" (exposure referring to the fact that two users were in contact long enough and close enough to be recorded by the app).
k would be the degree of anonymity that an infected person A has against another app user B.
k is equal to the number of app users that B has seen during the last 15 days.
A has probably met many people during the last 15 days, so we call the people she was exposed to B_i (B_0, B_1, B_2, etc.).
Each app user B_i met a different number of users during these last 15 days.
This means that A has a different anonymity degree against each user B_i.
The anonymity degree of A against B_i is referred to as k_i.
If a user B_j has seen only one app user in the last 15 days, A will be 1-exposure-anonymous to B_j, so no anonymity is provided to A against B_j.
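
To make this accounting concrete, here is a minimal sketch; the contacts mapping, the names, and the handling of the 15-day window are purely illustrative assumptions, not part of ROBERT:

```python
from typing import Dict, Set

# Hypothetical contact data: for each app user B_i, the set of distinct
# app users they were exposed to during the last 15 days.
contacts: Dict[str, Set[str]] = {
    "B_0": {"A", "C", "D"},            # k_0 = 3
    "B_1": {"A"},                      # k_1 = 1 -> no anonymity for A
    "B_2": {"A", "C", "D", "E", "F"},  # k_2 = 5
}

def exposure_anonymity(observer: str) -> int:
    """k_i: how many distinct app users `observer` met during the window.
    An infected contact of `observer` is hidden among these k_i users."""
    return len(contacts.get(observer, set()))

for b_i in contacts:
    print(b_i, "->", exposure_anonymity(b_i))  # B_0 -> 3, B_1 -> 1, B_2 -> 5
```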

Introducing this k-exposure-anonymity framing could help us improve our discussion and evaluation of ROBERT:

  • At which value of k do we consider that an infected user's privacy is protected? 2? 5? 100?
  • Should we refrain from notifying users B_i that have met fewer people than the desired k value, in order to protect infected persons' privacy?
  • Does the average user meet enough people in a 15-day timeframe for k-exposure-anonymity to be meaningful?
  • How can we model a more accurate k value given a user's environment? For example, if I live with 5 people, I will in theory provide at least k=5 anonymity to infected persons, but the realistic value is k' = k - 5 (see the sketch after this list).
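
To illustrate the second and fourth questions, here is a minimal sketch of a notification-suppression rule; the threshold K_MIN, the household adjustment, and the helper names are assumptions made for the example, not part of ROBERT:

```python
# Illustrative only: K_MIN is an assumed policy threshold, not a value taken
# from the ROBERT specification.
K_MIN = 5

def effective_anonymity(k: int, household_size: int = 0) -> int:
    """k' = k - household_size: household members are easy to rule in or out,
    so they do not really hide an infected contact."""
    return max(k - household_size, 0)

def should_notify(k: int, household_size: int = 0) -> bool:
    """Suppress the notification when the infected person would be left with
    fewer than K_MIN candidate contacts from this observer's point of view."""
    return effective_anonymity(k, household_size) >= K_MIN

print(should_notify(k=1))                    # False: B_j met only one person
print(should_notify(k=8, household_size=5))  # False: k' = 3 < K_MIN
print(should_notify(k=12, household_size=5)) # True:  k' = 7 >= K_MIN
```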

To conclude, I think that ROBERT tries to provide a property that I name "k-exposure-anonymity" without explicitly defining and naming it. The authors should clearly state the property they seek and critically evaluate their proposal in light of it.

Note: this post is not about the usefulness of this property or about the tradeoffs it involves (such as requiring authentication and trust in an authority); however, a cost/benefit analysis of introducing such a property would be very interesting.

@superboum changed the title from ROBERT's desired property is not clearly stated, proposing: "k-exposure-anonymity" to ROBERT's desired property is not clearly stated, proposing: "k-anonymity" on May 1, 2020
@vincent-grenoble
Contributor

Hello @superboum. I note that, as you say:

This post is not about the usefulness of this property or about tradeoff it involves [...]

I agree with the way you formulate the problem. Yet I cannot help thinking about the limits of manual tracing: an elderly person (B) who sees only the person who brings her daily meals (A). Here A will be 1-exposure-anonymous to B. So what? You ask in particular:

  • Should we refrain from notifying users B_i that have met fewer people than the desired k value, in order to protect infected persons' privacy?

Manual tracing considers that the benefits outweigh the risks, and de-anonymisation is not considered a fundamental obstacle. Why should this very particular situation be treated differently when an app is involved?

Once again we are back to the “honest but curious” assumption. As we already said in #2:

[...] Concerning the “honest but curious” assumption: this is a key assumption for the ROBERT v1.0 design as you noticed. It is not our responsibility, as privacy researchers, to judge whether or not this assumption is valid.
This topic could be discussed for hours, clearly. However, when looking at the “avis CNIL sur le projet d’application mobile StopCovid”, we have the feeling this is a reasonable assumption.

@superboum
Author
superboum commented May 4, 2020

I propose k-anonymity as it seems to capture the property you want, and it could help us leverage the vast literature on the subject. At the same time, it covers a vast majority of attacks and simplifies the problem.

About "honest but curious"

I am speaking about end users here. Does this mean you assume "honest but curious" end users too? If so, why not simply use the APIs provided by phone manufacturers, such as Trusty or, more generally, TPM/TEE mechanisms?

But hardware security mechanisms may not be needed at all. Users do not have root access on their phones, so if the collected list is managed by the system rather than by an app, it is possible to expose an API on the phone that does not reveal ephemeral IDs. It seems that this is what Apple+Google have chosen to do. source
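
As a purely hypothetical sketch of what such an OS-mediated boundary could look like (the interface, names, and the single aggregate count are my assumptions, not the actual Apple+Google API):

```python
from abc import ABC, abstractmethod

class ExposureService(ABC):
    """Hypothetical system-level service: the list of observed ephemeral IDs
    stays inside the OS, and apps only ever see an aggregate verdict."""

    @abstractmethod
    def exposure_count(self) -> int:
        """Number of risky exposures over the last 15 days; ephemeral IDs and
        timestamps never cross this boundary into the app."""

class TracingApp:
    def __init__(self, service: ExposureService) -> None:
        self.service = service

    def should_warn_user(self) -> bool:
        # The app can warn its user, but it cannot enumerate which
        # recorded contacts matched.
        return self.service.exposure_count() > 0
```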

So if you choose a weak attack model, then it seems the problem is already solved. And if you choose a stronger attack model, it seems many things are still blurry.

@IcyApril

Hi;

I've previously done a lot of work on hash-based k-Anonymous search; I created the k-Anonymous search approach used in Have I Been Pwned, and later worked with Cornell University to provide formal analysis of the protocols and refine them into new C3S protocols [ACM]. This work fed into the effort by Google and Stanford to create Google Password Checkup [Usenix].
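
For readers unfamiliar with that approach, here is a minimal sketch of the hash-prefix range query popularised by Have I Been Pwned (using the public Pwned Passwords endpoint; error handling and rate limiting omitted):

```python
import hashlib
import urllib.request

def pwned_count(password: str) -> int:
    """k-anonymous range query: only the first 5 hex characters of the SHA-1
    hash leave the client, so the server learns nothing more than the bucket
    of candidate hashes sharing that prefix."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    url = f"https://api.pwnedpasswords.com/range/{prefix}"
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode("utf-8")
    # Each response line is "SUFFIX:COUNT"; the match, if any, is found
    # locally and never revealed to the server.
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip() == suffix:
            return int(count.strip())
    return 0
```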

Before the pandemic, I was doing work on anonymising wireless unique identifiers (for example, in Bluetooth Journey Time Monitoring Systems). This work provides formal analysis and experimental data for applying k-Anonymity to hashes for the purpose of anonymisation. The pre-print of the paper is here (conference accepted): https://arxiv.org/abs/2005.06580

Recently, I've been working on using k-Anonymity to prevent de-anonymisation attacks in existing Contact Tracing protocols. I have formed a hypothesis for using a cross-hashing approach to provide a cryptographic guarantee of a minimum contact time and additionally prevent "Deanonymizing Known Reported Users". It uses a k-Anonymous search approach to reduce the communication overhead and to offer further protection against data leaks from the server (using Private Set Intersection). The hypothesis can be found here alongside a discussion of the risk factors - but do note there are no experimental results at this stage and the paper is not peer-reviewed: https://arxiv.org/abs/2005.12884

If anyone has any feedback on this work, please do reach out to me (my email is on the papers).

Thanks
