Woodstock Blog

a tech blog for general algorithmic interview questions

[Fundamental] Recap on Java HashMap

The HashMap

A hash table is made up of two parts: an array (the actual table where the data to be searched is stored) and a mapping function, known as a hash function.

The hash function is a mapping from the input space to the integer space that defines the indices of the array. In other words, the hash function provides a way for assigning numbers to the input data such that the data can then be stored at the array index corresponding to the assigned number.

For example, if I want to store <“Durant”>, I pass “Durant” into the hash function, and get (let’s say) number 3. So in the Hash Table, it will store table(3 -> “Durant”).

In this way, the searching of HashMap can almost achieve O(1) time in best case (like array access).

However,

For average case, It really is (as the wikipedia page says) O(1+n/k) where K is the hash table size. If K is large enough, then the result is effectively O(1). But suppose K is 10 and N is 100. In that case each bucket will have on average 10 entries, so the search time is definitely not O(1); it is a linear search through up to 10 entries.

In practise, we will just assume search in HashMap always O(1).

Below is a great conclusion.

Hash tables are O(1) average and amortized case complexity, however is suffers from O(n) worst case time complexity. [And I think this is where your confusion is]

Hash tables suffer from O(n) worst time complexity due to two reasons:

  1. If too many elements were hashed into the same key: looking inside this key may take O(n) time.
  2. Once a hash table has passed its load balance - it has to rehash [create a new bigger table, and re-insert each element to the table].

However, it is said to be O(1) average and amortized case because:

  1. It is very rare that many items will be hashed to the same key [if you chose a good hash function and you don't have too big load balance.
  2. The rehash operation, which is O(n), can at most happen after n/2 ops, which are all assumed O(1): Thus when you sum the average time per op, you get : (n*O(1) + O(n)) / n) = O(1)