Hash Set

The HashSet class provides a set implementation that is backed by a hashtable data structure (implemented as a HashMap instance, discussed later, which provides a quick way to determine if an element has already been stored in this structure). Although this class provides no ordering guarantees for its elements, HashSet is much faster than TreeSet. Furthermore, HashSet permits the null reference to be stored in its instances.

NOTE: Check out Wikipedia's "Hash table" entry

(http://en.wikipedia.org/wiki/Hash_table) to learn about hashtables.

HashSet supplies four constructors:

■ HashSet() creates a new, empty hashset where the backing HashMap instance has an initial capacity of 16 and a load factor of 0.75. You will learn what these items mean when I discuss HashMap later in this chapter.

■ HashSet(Collection<? extends E> collection) creates a new hashset containing collection's elements. The backing HashMap has an initial capacity sufficient to contain collection's elements and a load factor of 0.75. This constructor throws NullPointerException when collection contains the null reference.

■ HashSet(int initialCapacity) creates a new, empty hashset where the backing HashMap instance has the capacity specified by initialCapacity and a load factor of 0.75. This constructor throws IllegalArgumentException when initialCapacity's value is less than 0.

■ HashSet(int initialCapacity, float loadFactor) creates a new, empty hashset where the backing HashMap instance has the capacity specified by initialCapacity and the load factor specified by loadFactor. This constructor throws IllegalArgumentException when initialCapacity is less than 0 or when loadFactor is less than or equal to 0.

Listing 8-8 demonstrates a hashset.

Listing 8-8. A demonstration of a hashset with String elements unordered import java.util.HashSet;

import java.util.Set;

public class HashSetDemo

public static void main(String[] args) {

Set<String> ss = new HashSet<String>();

String[] fruits = {"apples", "pears", "grapes", "bananas", "kiwis", "pears", null};

for (String fruit: fruits) ss.add(fruit);

static void dump(String title, Set<String> ss) {


In Listing 8-7's TreeSetDemo application, I did not add null to the fruits array because

TreeSet throws NullPointerException when it detects an attempt to add this element. In contrast, HashSet permits null to be added, which is why Listing 8-8 includes null in HashSetDemo's fruits array.

When you run this application, it generates unordered output such as the following:

ss: null grapes bananas kiwis pears apples

Suppose you want to add instances of your classes to a hashset. As with String, your classes must override equals() and hashCode(); otherwise, duplicate class instances can be stored in the hashset. For example, Listing 8-9 presents the source code to an application whose Planet class overrides equals() but fails to also override hashCode().

Listing 8-9. A custom Planet class not overriding hashCode()

import java.util.HashSet; import java.util.Set;

public class CustomClassAndHashSet {

public static void main(String[] args) {

Set<Planet> sp = new HashSet<Planet>(); sp.add(new Planet("Mercury")); sp.add(new Planet("Venus")); sp.add(new Planet("Earth")); sp.add(new Planet("Mars")); sp.add(new Planet("Jupiter")); sp.add(new Planet("Saturn")); sp.add(new Planet("Uranus")); sp.add(new Planet("Neptune")); sp.add(new Planet("Fomalhaut b")); Planet p1 = new Planet("51 Pegasi b"); sp.add(p1);

Planet p2 = new Planet("51 Pegasi b"); sp.add(p2);

System.out.println(p1.equals(p2)); System.out.println(sp);

class Planet {

private String name;

Planet(String name) {

this.name = name;

@Override public boolean equals(Object o) {

if (!(o instanceof Planet))

return false; Planet p = (Planet) o; return p.name.equals(name);

String getName() {

return name;

^Override public String toString() {

return name;

Listing 8-9's Planet class declares a single name field of type String. Although it might seem pointless to declare Planet with a single String field because I could refactor this listing to remove Planet and work with String, I might want to introduce additional fields to Planet (perhaps to store a planet's mass and other characteristics) in the future.

NOTE: equals() relies on a little known fact about the Java language: one instance's private members can be accessed from another instance of the same class. For example, equals() can specify p.name to access p's private name field. Directly accessing an instance's private members in this manner is legal because encapsulation is not violated.

When you run this application, it generates unordered output such as the following:


[Jupiter, Fomalhaut b, Neptune, Uranus, Venus, Earth, Mercury, 51 Pegasi b, Mars,^ Saturn, 51 Pegasi b]

This output reveals two 51 Pegasi b elements in the hashset. Although these elements are equal from the perspective of the overriding equals() method (the first output line, true, proves this point), overriding equals() is not enough to avoid duplicate elements being stored in a hashset: you must also override hashCode().

The easiest way to override hashCode() in Listing 8-9's Planet class is to have the overriding method call the name field's hashCode() method and return its value. (This technique only works with a class whose single reference field's class provides a valid hashCode() method.) Listing 8-10 presents this overriding hashCode() method.

Listing 8-10. Handing over hash code calculation to another hashCode() method

^Override public int hashCode() {

return name.hashCode();

Introduce this method into the previous Planet class and run the application. You will observe output (similar to that shown below) that reveals no duplicate elements:


[Saturn, Earth, Uranus, Fomalhaut b, 51 Pegasi b, Venus, Jupiter, Mercury, Mars,^ Neptune]

NOTE: LinkedHashSet is a subclass of HashSet that uses a linked list to store its elements. As a result, LinkedHashSet's iterator returns elements in the order in which they were inserted. For example, if Listing 8-8 had specified Set<String> ss = new LinkedHashSet<String>();, the application's output would have been ss: apples pears grapes bananas kiwis null. Also, LinkedHashSet offers slower performance than HashSet and faster performance than TreeSet.

Was this article helpful?

0 0

Post a comment