Do bloom filters protect against a persistent eavesdropper?

As far as I understand it, a bloom filter is defined by a set of target hashes and a target false-positive rate. When it is used to filter the blockchain, all of the target transactions are guaranteed to be found, but by design many false positives are also returned to the client.

Suppose that an attacker is able to monitor many bloom filter specifications sent by an SPV client to its peers. Will he be able to eventually eliminate all false positives by performing subset matching on all transactions matching the filters?

As an example, let the set of all transactions be the integers from 1 to 100, and suppose that a client is interested in the transactions 4, 8 and 15. The first time he connects to the network, he transmits a bloom filter matching (1, 4, 7, 8, 15, 27, 44, 73); the second time, his filter matches (3, 4, 6, 8, 15, 27, 66). An attacker would immediately be able to narrow the possible transactions down to (4, 8, 15, 27); after several more invocations, he would then find the correct answer.
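The subset matching in this example can be sketched in a few lines of Python. This is only an illustration of the attacker's reasoning, assuming he already knows which transactions each observed filter matched; the name `intersect_observations` is made up for this sketch.

```python
# Each set is what one observed filter matched (numbers from the example above).
observed = [
    {1, 4, 7, 8, 15, 27, 44, 73},   # first connection
    {3, 4, 6, 8, 15, 27, 66},       # second connection
]

def intersect_observations(observations):
    """Return the candidates that every observed filter matched."""
    candidates = set(observations[0])
    for matched in observations[1:]:
        # True targets match every filter, so they survive each intersection;
        # a false positive survives only if it happens to recur.
        candidates &= matched
    return candidates

print(sorted(intersect_observations(observed)))  # [4, 8, 15, 27]
```

Every further observation whose false positives differ shrinks the candidate set towards (4, 8, 15).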

Am I misunderstanding how bloom filters work (e.g. the false positives do not change between connections), or is this a theoretical or even practical concern?

lxgr

Posted 2014-04-27T18:37:54.837

Reputation: 281

Answers

See this paper for a good discussion:

On the Privacy Provisions of Bloom Filters in Lightweight Bitcoin Clients

Albert s

Posted 2014-04-27T18:37:54.837

Reputation: 1 344

Yes, you are misunderstanding how vanilla/canonical bloom filters work. For the standard definition, false positives do not change between connections - the set returned on a second request, if different, will be a superset of the first set.
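To make the determinism point concrete, here is a toy sketch rather than any real implementation. It assumes SHA-256 as a stand-in hash and a hypothetical `tweak` parameter that is mixed into every hash; `TinyBloom` and `matches` are names invented for this example. With the same parameters the matched set is identical on every run, and only a change to the hash inputs can change which false positives appear.

```python
import hashlib

class TinyBloom:
    """Toy bloom filter; SHA-256 is a stand-in, not what any real client uses."""

    def __init__(self, n_bits, n_hashes, tweak=0):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.tweak = tweak              # hypothetical per-filter hash input
        self.bits = [False] * n_bits

    def _positions(self, item):
        # Derive n_hashes bit positions from (tweak, hash index, item).
        for i in range(self.n_hashes):
            data = f"{self.tweak}:{i}:{item}".encode()
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

def matches(targets, universe, tweak):
    """Build a filter for the targets and return everything in universe it matches."""
    bloom = TinyBloom(n_bits=64, n_hashes=2, tweak=tweak)
    for t in targets:
        bloom.add(t)
    return {x for x in universe if x in bloom}

targets = {4, 8, 15}
universe = range(1, 101)
same_a = matches(targets, universe, tweak=1)
same_b = matches(targets, universe, tweak=1)
other = matches(targets, universe, tweak=2)

print(same_a == same_b)                            # identical parameters, identical matches
print(targets <= same_a and targets <= other)      # targets are always matched
print(sorted(same_a - targets), sorted(other - targets))  # false positives per tweak
```

The last line prints the false positives for each tweak so the two lists can be compared directly.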

Also, vanilla bloom filters are not used as solutions to security/eavesdropping problems - they're an optimization that trades computation for disk access (I think of them as a subset of caching algorithms). You might use a local bloom filter to check whether a given URL is malicious, so that you avoid an expensive network lookup on each URL access and only make the network call when the filter gives a positive response.

That said, there are techniques for building bloom-filter-like structures that are more robust or have stronger security characteristics. I don't believe any of them have been applied to bitcoin implementations.

blueberryfields

Posted 2014-04-27T18:37:54.837

Reputation: 111

The bitcoin specification of bloom filters seems to contain a random input to the hash functions (ntweak) - does this not also influence the false positives? – lxgr – 2014-04-27T21:52:36.257

@lxgr From a quick look through the code, it seems to me that the ntweak value is meant to help better distribute the use of hash functions, and ultimately reduce the number of false positives you get (further reducing the security value that the bloom filter might provide, but increasing its effectiveness as a cache). You're right, the implementation is non-canonical in the sense of being likely to return random false positives on successive calls. – blueberryfields – 2014-04-28T05:24:08.147