Monday, February 17, 2014

HashSet vs HashMap

Today, one of my developers was using a HashMap to store a white-list of IP addresses. He was storing the IP address as both the key and value.
I told him about the HashSet class that can also be used to store a white-list of elements and similar to the HashMap; a HashSet also provides a time-complexity of 1 for get()/contains() operations.

We were a bit intrigued on how a HashSet actually works; so we looked into the source code. Much to our surprise, a HashSet actually uses a HashMap internally. For every item we add to the set, it adds it as a key to the HashMap and adds a dummy object as a value, as given below.

// Dummy value to associate with an Object in the backing Map
96     private static final Object PRESENT = new Object();

In theory, a Set and Map are separate concepts. A set cannot contain duplicates and is unordered, whereas a Map is a key/value pair collection.

Besides the HashSet, you also have 2 more classes or Set implementations that are useful. The first being TreeSet that sorts the elements as you store them in the collection. Second is the LinkedHashSet that maintains the insertion order.