com.planetj.math.rabinhash
Class RabinHashFunction32

java.lang.Object
  extended bycom.planetj.math.rabinhash.RabinHashFunction32
All Implemented Interfaces:
java.lang.Cloneable, java.io.Serializable

public final class RabinHashFunction32
extends java.lang.Object
implements java.io.Serializable, java.lang.Cloneable

This class provides an implementation of a hash function based on Rabin fingerprints, one which can efficiently produce a 32-bit hash value for a sequence of bytes. It does so by considering strings of bytes as large polynomials over GF(2) -- that is, with coefficients of 0 and 1 -- and then reducing them modulo some irreducible polynomial of degree 32. The result is a hash function with very satisfactory properties. In addition the polynomial operations are fast in hardware; even in this Java implementation the speed is reasonable.

Methods in this class can compute a hash value for an array of bytes, chars or ints, as well as any Serializable object, String, file, or resource denoted by URL.

Methods of this class are all thread-safe, and hash function objects are immutable.

Polynomials of degree 32 are used frequently in this code, and are represented efficiently as ints. An int has 32 bits, whereas a polynomial of degree 32 has 33 coefficients. Therefore, the high-order bit of the int is the degree 31 term's coefficient, and the low-order bit is the constant coefficient.

For example the integer 0x00000803, in binary, is:

00000000 00000000 00001000 00000011

Therefore it correponds to the polynomial:

x32 + x11 + x + 1

The implementation is derived from the paper "Some applications of Rabin's fingerprinting method" by Andrei Broder. See http://server3.pa-x.dec.com/SRC/publications/src-papers.html for a full citation and the paper in PDF format.

Since:
2.0
Version:
2.0
Author:
Sean Owen
See Also:
Serialized Form

Field Summary
static RabinHashFunction32 DEFAULT_HASH_FUNCTION
          Default hash function, provided for convenience.
 
Constructor Summary
RabinHashFunction32(int P)
          Creates a RabinHashFunction32 based on the specified polynomial.
 
Method Summary
 boolean equals(java.lang.Object o)
           
 int getP()
           
 int hash(byte[] A)
          Return the Rabin hash value of an array of bytes.
 int hash(java.nio.ByteBuffer A)
          Returns the Rabin hash value of a ByteBuffer.
 int hash(char[] A)
          Return the Rabin hash value of an array of chars.
 int hash(java.io.File f)
          Computes the Rabin hash value of the contents of a file.
 int hash(java.io.InputStream is)
          Computes the Rabin hash value of the data from an InputStream.
 int hash(int[] A)
          Returns the Rabin hash value of an array of ints.
 int hash(java.nio.IntBuffer A)
          Returns the Rabin hash value of an IntBuffer.
 int hash(java.io.Serializable obj)
          Returns the Rabin hash value of a serializable object.
 int hash(java.lang.String s)
          Computes the Rabin hash value of a String.
 int hash(java.net.URL url)
          Computes the Rabin hash value of the contents of a file, specified by URL.
 int hashCode()
           
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

DEFAULT_HASH_FUNCTION

public static final RabinHashFunction32 DEFAULT_HASH_FUNCTION
Default hash function, provided for convenience.

Constructor Detail

RabinHashFunction32

public RabinHashFunction32(int P)

Creates a RabinHashFunction32 based on the specified polynomial.

This class does not test the polynomial for irreducibility; therefore this constructor should only be used with polynomials that are already known to be irreducible, or else the hash function will not perform optimally.

Parameters:
P - a degree 32 polynomial over GF(2), represented as an int
Method Detail

getP

public int getP()
Returns:
irreducible polynomial used in this hash function, represented as an int

hash

public int hash(byte[] A)

Return the Rabin hash value of an array of bytes.

Parameters:
A - the array of bytes
Returns:
the hash value
Throws:
java.lang.NullPointerException - if A is null

hash

public int hash(char[] A)

Return the Rabin hash value of an array of chars.

Parameters:
A - the array of chars
Returns:
the hash value
Throws:
java.lang.NullPointerException - if A is null

hash

public int hash(int[] A)

Returns the Rabin hash value of an array of ints. This method is the most efficient of all the hash methods, so it should be used when possible.

Parameters:
A - array of ints
Returns:
the hash value
Throws:
java.lang.NullPointerException - if A is null

hash

public int hash(java.nio.ByteBuffer A)

Returns the Rabin hash value of a ByteBuffer.

Parameters:
A - ByteBuffer
Returns:
the hash value
Throws:
java.lang.NullPointerException - if A is null

hash

public int hash(java.nio.IntBuffer A)

Returns the Rabin hash value of an IntBuffer.

Parameters:
A - IntBuffer
Returns:
the hash value
Throws:
java.lang.NullPointerException - if A is null

hash

public int hash(java.io.Serializable obj)

Returns the Rabin hash value of a serializable object.

Parameters:
obj - the object to be hashed
Returns:
the hash value
Throws:
java.lang.NullPointerException - if obj is null

hash

public int hash(java.lang.String s)

Computes the Rabin hash value of a String.

Parameters:
s - the string to be hashed
Returns:
the hash value
Throws:
java.lang.NullPointerException - if s is null

hash

public int hash(java.io.File f)
         throws java.io.FileNotFoundException,
                java.io.IOException

Computes the Rabin hash value of the contents of a file.

Parameters:
f - the file to be hashed
Returns:
the hash value of the file
Throws:
java.io.FileNotFoundException - if the file cannot be found
java.io.IOException - if an error occurs while reading the file
java.lang.NullPointerException - if f is null

hash

public int hash(java.net.URL url)
         throws java.io.IOException

Computes the Rabin hash value of the contents of a file, specified by URL.

Parameters:
url - the URL of the file to be hashed
Returns:
the hash value of the file
Throws:
java.io.IOException - if an error occurs while reading from the URL
java.lang.NullPointerException - if url is null

hash

public int hash(java.io.InputStream is)
         throws java.io.IOException

Computes the Rabin hash value of the data from an InputStream.

Parameters:
is - the InputStream to hash
Returns:
the hash value of the data from the InputStream
Throws:
java.io.IOException - if an error occurs while reading from the InputStream
java.lang.NullPointerException - if stream is null

equals

public boolean equals(java.lang.Object o)

hashCode

public int hashCode()

toString

public java.lang.String toString()