Leakage under a Single Bloom Filter

In the sequel, we assume that SPV clients use anonymizing networks when connecting to regular Bitcoin nodes. As mentioned earlier, this alleviates potential leakage in the network layer. We additionally assume that SPV clients do not insert the public key and the corresponding Bitcoin address into the same filter in order to prevent trivial leakages of their embedded addresses (see Section 6.2.4). We show that even with these measures, Bloom filters still leak considerable information about the embedded client addresses.

Namely, in existing SPV clients, a node initializes its Bloom filter Bi with a random nonce r and specifies its target false positive rate Pt that can be achieved when a number of elements M have been inserted in the filter. By default, M is set to m + 100 = 2N + 100, where N is the number of Bitcoin addresses inserted in Bi. Here, the additional number 100 was originally added to m by the Bitcoin developers in order to avoid the recomputation of a new filter in the case where a user inserts up to 50 additional Bitcoin addresses (recall that a Bitcoin address is inserted into the Bloom filter by adding both the corresponding public key and the public key hash to the filter; therefore m = 2N).

The default target false positive rate of the Bloom filter Pt is set to 0.05% at the time of writing. The size of the filter n and the number of hashes k in Bloom filters are computed as follows:

Note that if the SPV client restarts (e.g., mobile phone reboots, mobile application is restarted), then the Bloom filter will be recomputed (the SPV client stores the Bloom filter in volatile memory). When the user acquires 50 or more additional addresses such that m > M, then the SPV client will resize the Bloom filter by recomputing M = 2N + 100, and will send the updated Bloom filter to the full Bitcoin nodes that it is connected to.

Note that given n and k, the number of elements contained in a Bloom filter can be estimated by the adversary as follows [9]:

Here, X corresponds to the number of bits of the Bloom filter set to one. Given n and Pt, M can also be computed by the adversary from (6.2).

Note that, in April 2014, the Bitcoin blockchain comprised nearly |B| =33 million addresses. This means that an adversary can simply try all possible addresses in the Bitcoin system in order to compute the positives of the Bloom filter Bi, denoted in the sequel by Bi.

Following from [4], we quantify the privacy offered by a Bloom filter using the probability, Ph(j), that the adversary correctly guesses any j true positives of a Bloom filter among all positives that match the filter and which are not included in the knowledge of the adversary.2 More specifically, we measure Ph(j) achieved by

a Bloom filter Bb as follows: Ph(j) = ПС0 NN+s-k = NNs • nN S-i .... Here N refers to the number of Bitcoin addresses inserted into Bi and S denotes the

cardinality of the set {Fi - K}; S therefore corresponds to all false positives that match Bi, but for which the adversary does not have any knowledge about.

2 Clearly, the higher is Ph( ), the smaller is the privacy of the SPV node.

It is straightforward to see that:

Here, N C |B|, and m = 2N is the number of elements contained in the Bloom filter seen by the adversary.

Note that if the adversary is able to identify an SPV client (e.g., by some side channel information), then simply identifying any address pertaining to that client would be a considerable violation of its privacy. Otherwise, if the adversary can link a number of addresses to the same anonymous client, then the information offered by the clustering of these addresses offers the adversary considerable information about the profile of the client, such as its purchasing habits, and so on.

It is worthy to note that Ph(} may not always capture the probability of guessing addresses that belong to the user of the SPV client, for example, in the case where the SPV client may embed addresses that do belong to the user in its Bloom filter.

 
Source
< Prev   CONTENTS   Source   Next >