@InterfaceAudience.Public
 @InterfaceStability.Stable
public class BloomFilter
extends org.apache.hadoop.util.bloom.Filter
The Bloom filter is a data structure that was introduced in 1970 and that has been adopted by the networking research community in the past decade thanks to the bandwidth efficiencies that it offers for the transmission of set membership information between networked hosts. A sender encodes the information into a bit vector, the Bloom filter, that is more compact than a conventional representation. Computation and space costs for construction are linear in the number of elements. The receiver uses the filter to test whether various elements are members of the set. Though the filter will occasionally return a false positive, it will never return a false negative. When creating the filter, the sender can choose its desired point in a trade-off between the false positive rate and the size.
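
The encode-and-test behavior described above can be illustrated with a minimal, self-contained sketch. It is not the Hadoop implementation: the class name SimpleBloomSketch is hypothetical, keys are plain strings rather than Key objects, and the bit positions are derived from String.hashCode with an illustrative mixing step instead of the hash functions selected by hashType.

```java
import java.util.BitSet;

// Hypothetical illustration only; not org.apache.hadoop.util.bloom.BloomFilter.
final class SimpleBloomSketch {
    private final BitSet bits;
    private final int vectorSize;
    private final int nbHash;

    SimpleBloomSketch(int vectorSize, int nbHash) {
        this.bits = new BitSet(vectorSize);
        this.vectorSize = vectorSize;
        this.nbHash = nbHash;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    // The mixing constant is an assumption of this sketch.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
        return Math.floorMod(h1 + i * h2, vectorSize);
    }

    // Set every bit position that the key hashes to.
    void add(String key) {
        for (int i = 0; i < nbHash; i++) {
            bits.set(position(key, i));
        }
    }

    // A single unset bit proves the key was never added.
    // If all bits are set, the key is present or this is a false positive.
    boolean membershipTest(String key) {
        for (int i = 0; i < nbHash; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Because add only ever sets bits, a key that was added always passes membershipTest, which is why false negatives cannot occur. False positives arise when other keys happen to have set all of a key's positions.
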
Originally created by European Commission One-Lab Project 034819.
See Also: The general behavior of a filter, Space/Time Trade-Offs in Hash Coding with Allowable Errors

| Constructor | Description |
|---|---|
| BloomFilter() | Default constructor - use with readFields |
| BloomFilter(int vectorSize, int nbHash, int hashType) | Constructor |
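
The constructor takes vectorSize and nbHash directly, so the caller decides where to sit in the trade-off between size and false positive rate. The sketch below uses the standard textbook approximations for a Bloom filter with m bits, k hash functions, and n keys. The class and method names are hypothetical and are not part of the Hadoop API.

```java
// Hypothetical helper; the formulas are the usual Bloom filter approximations.
final class BloomSizingSketch {
    private BloomSizingSketch() {}

    // m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits.
    static int optimalVectorSize(long expectedKeys, double falsePositiveRate) {
        double ln2 = Math.log(2.0);
        return (int) Math.ceil(-expectedKeys * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    // k = (m / n) * ln 2, rounded to the nearest integer and at least 1.
    static int optimalNbHash(long expectedKeys, int vectorSize) {
        double k = (double) vectorSize / expectedKeys * Math.log(2.0);
        return Math.max(1, (int) Math.round(k));
    }

    // p is approximately (1 - e^(-k * n / m))^k.
    static double estimatedFalsePositiveRate(long keys, int vectorSize, int nbHash) {
        double fractionOfBitsSet = 1.0 - Math.exp(-(double) nbHash * keys / vectorSize);
        return Math.pow(fractionOfBitsSet, nbHash);
    }
}
```

A caller could compute vectorSize and nbHash this way for an expected key count and a target false positive rate, then pass them to the three-argument constructor together with a hashType.
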

| Modifier and Type | Method | Description |
|---|---|---|
| void | add(org.apache.hadoop.util.bloom.Key key) | Adds a key to this filter. |
| void | and(org.apache.hadoop.util.bloom.Filter filter) | Performs a logical AND between this filter and a specified filter. |
| int | getVectorSize() | |
| boolean | membershipTest(org.apache.hadoop.util.bloom.Key key) | Determines whether a specified key belongs to this filter. |
| void | not() | Performs a logical NOT on this filter. |
| void | or(org.apache.hadoop.util.bloom.Filter filter) | Performs a logical OR between this filter and a specified filter. |
| void | readFields(DataInput in) | Deserialize the fields of this object from in. |
| String | toString() | |
| void | write(DataOutput out) | Serialize the fields of this object to out. |
| void | xor(org.apache.hadoop.util.bloom.Filter filter) | Performs a logical XOR between this filter and a specified filter. |
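
The logical operations listed above all modify the receiving filter in place. The following sketch shows that behavior on a bare bit vector using java.util.BitSet. The class FilterLogicSketch and its set and get helpers are hypothetical, and the size check is an assumption of this sketch rather than a statement of what the Hadoop class validates.

```java
import java.util.BitSet;

// Hypothetical illustration of in-place logical operations on a bit vector.
final class FilterLogicSketch {
    private final BitSet bits;
    private final int vectorSize;

    FilterLogicSketch(int vectorSize) {
        this.bits = new BitSet(vectorSize);
        this.vectorSize = vectorSize;
    }

    void set(int index) { bits.set(index); }

    boolean get(int index) { return bits.get(index); }

    // Each operation stores its result in this object; the argument is left unchanged.
    void and(FilterLogicSketch other) { requireSameSize(other); bits.and(other.bits); }

    void or(FilterLogicSketch other) { requireSameSize(other); bits.or(other.bits); }

    void xor(FilterLogicSketch other) { requireSameSize(other); bits.xor(other.bits); }

    // Flip every bit in the vector, including bits that were never set.
    void not() { bits.flip(0, vectorSize); }

    // Assumption of this sketch: only vectors of equal size may be combined.
    private void requireSameSize(FilterLogicSketch other) {
        if (other == null || other.vectorSize != vectorSize) {
            throw new IllegalArgumentException("filters cannot be combined");
        }
    }
}
```

For two Bloom filters built with the same size and hash functions, OR-ing the bit vectors gives the same bits as adding both key sets to one filter, which is the most common reason to combine filters.
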

public BloomFilter()

public BloomFilter(int vectorSize,
                   int nbHash,
                   int hashType)
Parameters:
vectorSize - The vector size of this filter.
nbHash - The number of hash functions to consider.
hashType - The type of the hashing function (see Hash).

public void add(org.apache.hadoop.util.bloom.Key key)
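Description copied from class: org.apache.hadoop.util.bloom.Filter
Adds a key to this filter.
Specified by: add in class org.apache.hadoop.util.bloom.Filter
Parameters:
key - The key to add.

public void and(org.apache.hadoop.util.bloom.Filter filter)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical AND between this filter and a specified filter.
Invariant: The result is assigned to this filter.
Specified by: and in class org.apache.hadoop.util.bloom.Filter
Parameters:
filter - The filter to AND with.

public boolean membershipTest(org.apache.hadoop.util.bloom.Key key)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Determines whether a specified key belongs to this filter.
Specified by: membershipTest in class org.apache.hadoop.util.bloom.Filter
Parameters:
key - The key to test.

public void not()
Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical NOT on this filter.
The result is assigned to this filter.
Specified by: not in class org.apache.hadoop.util.bloom.Filter

public void or(org.apache.hadoop.util.bloom.Filter filter)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical OR between this filter and a specified filter.
Invariant: The result is assigned to this filter.
Specified by: or in class org.apache.hadoop.util.bloom.Filter
Parameters:
filter - The filter to OR with.

public void xor(org.apache.hadoop.util.bloom.Filter filter)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical XOR between this filter and a specified filter.
Invariant: The result is assigned to this filter.
Specified by: xor in class org.apache.hadoop.util.bloom.Filter
Parameters:
filter - The filter to XOR with.

public int getVectorSize()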

public void write(DataOutput out) throws IOException
Description copied from interface: Writable
Serialize the fields of this object to out.
Specified by: write in interface Writable
Overrides: write in class org.apache.hadoop.util.bloom.Filter
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException - any other problem for write.

public void readFields(DataInput in) throws IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.
For efficiency, implementations should attempt to re-use storage in the existing object where possible.
Specified by: readFields in interface Writable
Overrides: readFields in class org.apache.hadoop.util.bloom.Filter
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException - any other problem for readFields.

Copyright © 2023 Apache Software Foundation. All rights reserved.