mirror of https://github.com/MISP/misp-website
chg: [blog] fixed
parent
983fd8969c
commit
e2b045d05e
|
@ -52,7 +52,7 @@ Another surprising thing is the very high value of `m` which is `uint64::MAX - 5
|
|||
|
||||
Everything was going fine, until we looked for further optimization to apply.
|
||||
|
||||
Digging into other datastructures (like hash tables) implementations, we noticed a nice optimization can be done on the bit **index generation algorithm**. This optimization consists of taking a bitset (holding all the bits of the filter) with a size being a **power of two**. If we do so, instead of computing bit index with a modulo, we can compute bit index with a `bit and` operation. This comes from the following nice property, for any `m` being a power of two `i % m == i & (m - 1)`. It is important to understand that this particular optimization will allow us to win on CPU cycles as less instructions are needed to compute the bit indexes.
|
||||
Digging into other datastructures (like hash tables) implementations, we noticed a nice optimization can be done on the **bit index generation algorithm**. This optimization consists of taking a bitset (holding all the bits of the filter) with a size being a **power of two**. If we do so, instead of computing bit index with a modulo, we can compute bit index with a `bit and` operation. This comes from the following nice property, for any `m` being a power of two `i % m == i & (m - 1)`. It is important to understand that this particular optimization will allow us to win on CPU cycles as less instructions are needed to compute the bit indexes.
|
||||
|
||||
Trying to apply this with bloom filter is fairly trivial, when computing the desired size (in bits) of the bloom filter we just need to take **the next power of two**. When implementing this with our current Rust port of the [bloom](https://github.com/DCSO/bloom) we realized the implementation was broken. When the bit size of the filter becomes a power of two, the **real false positive proability** of the filter literally explodes (**x20** in some cases). Without, going further into the details we believe the weakness comes from the **bit index generation** algorithm describes above. A [Github issue](https://github.com/DCSO/bloom/issues/19) has been opened in the original project to track this issue.
|
||||
|
||||
|
|
Loading…
Reference in New Issue