Poorly-populated Bloom Filters
Following from (6.5), Ph(1) (the probability of correctly guessing one address as a true positive) is large when 2N/M < 0.4, as long as N < 100. Given a modest number of addresses in the Bloom filter (i.e, N < 100), Pf (m = 2N) is small when m is small. As m increases, Pf (m = 2N) increases (and Ph(1) decreases). For example, when N = 10, Ph = 0.99 which corresponds to the probability of
correctly guessing all the true positives when the SPV client has 10 addresses.
This means that the information leakage in SPV clients is considerable for new Bitcoin users or for Bitcoin users that restart their SPV clients. For instance, at initialization time, the Bloom filter of SPV clients is typically instantiated using M = 102. Moreover, if the user is new in the Bitcoin system and only has 1 Bitcoin address, Ph(1)« 1—which results in complete lack of privacy. Recall that this observation also holds when the SPV client restarts and N < 100.
Gervais et al. analytically compute Ph(j) when the SPV client has 5, 10, 15 and 20 addresses . Their results show that guessing all addresses given one filter that embeds less than 15 addresses can be achieved with almost 0.80 probability. This probability decreases as the number of addresses embedded within the filter increases beyond 15. This analysis has been validated by means of experimentation in the real Bitcoin network by Gervais et al. .
On the other hand, when the Bloom filter comprises a considerable number of elements (i.e., when 2N/M > 0.4), then Pf (m = 2N) is close to Pt.