Been following YouTube to study about Database Hash Join. https://www.youtube.com/watch?v=J0nbgXIarhc
HASH TABLE
Design Decision
#1: Hash Function
→ How to map a large key space into a smaller domain.
→ Trade-off between being fast vs. collision rate.
Design Decision
#2: Hashing Scheme
→ How to handle key collisions after hashing.
→ Trade-off between allocating a large hash table vs.
additional instructions to find/insert keys.
Now I have some idea about the Hash function, but the math part I have no idea.
Hash Scheme is for handling collisions. Collision means that different input keys will have some output hash bits.
The following part is about the Chain Hashing.
Maintain a linked list of buckets for each slot in the hash table.
Resolve collisions by placing all elements with the same hash key into the same bucket.
→ To determine whether an element is present, hash to its bucket and scan for it.
→ Insertions and deletions are generalizations of lookups.
I still don't get it. Stackoverflow seems don't have good answers yet. So I come here, asking....