Hash tables are important! Note that insertion and find/lookup proceed through identical starting steps: hash the key and go to that table location. This enables the O(1) complexity for both.
Chaining is the common way to handle collisions. Just means each position is actually the head of a linked list.
Open addressing is the other method. Deletions are tricky with this.
Java has built-in hashcodes.
Python's dictionaries are implemented using hash tables. They are simply arrays whose indexes are obtained using a hash function on the keys.
Motivation:
"Think of a hash map as a "hack" on top of an array to let us use flexible keys instead of being stuck with sequential integer "indices."
All we need is a function to convert a key into an array index (an integer). That function is called a hashing function."
For the common case where I increment a key if it exists, but set it to 0 otherwise, use the get method:
my_dict[some_key] = my_dict.get(some_key, 0) + 1
Arrays offer easy random access but hard modification (have to move elements around).
In the worst case a larger array has to be reallocated too (but amortizes out).
In Java:
Use StringBuilder for expensive string concatenation. And for easy string reversal.
Everything is initialized to 0, including arrays.
Remember that chars are also ints.
Python only really has lists; arrays are just thin wrappers over C arrays. Always use lists unless have a really good explicit reason to need C-style arrays.
No comments