
Consistent hashing


Why do we need hashing in distributed systems? #

Distributed systems, by definition, need to manage data and workloads across multiple machines/nodes.

A distributed system may need to decide which node is responsible for storing or holding a particular piece of data, or which node should handle a specific request.

Plus, as the system scales and nodes are added or removed, it needs a way to recompute those decisions so that the system as a whole keeps operating properly.

Think of this as a mapping process: given some input criteria, the output is a node.

Hashing is a common method to achieve this mapping.

What does hash-based mapping look like? #

Hash functions transform input data of arbitrary size into fixed-size values (hash codes). In distributed systems, these hashes are used to pick a node (e.g. where data should live or which one should handle a request).

Suppose there are $N$ nodes and hash function $h$. Then, to find the index of the node to use given some input $x$:

$$i = h(x) \bmod N$$

This approach has several beneficial properties:

  1. Deterministic: The same key always maps to the same server (assuming no changes to the cluster)
  2. Uniform distribution: A good hash function spreads keys evenly across all servers
  3. Computational efficiency: Calculating the mapping is fast and doesn’t require communication between nodes
  4. No central coordination: Each node can independently determine where any key should go

For example, if we have a caching system with 5 servers and need to store a cache entry with key “user:42”:

We plug the values into our equation, $i = h(\text{“user:42”}) \bmod 5$.

Suppose $h(\text{“user:42”}) = 923487$.

Then $i = 923487 \bmod 5 = 2$.

So we would store the cache entry on server 2.
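To make the mapping concrete, here is a minimal Python sketch of the same idea. The function name `node_for_key` and the choice of MD5 are illustrative (any stable hash works), and the number 923487 in the walkthrough above is just a supposed value, not what MD5 actually returns:

```python
# A minimal sketch of naive hash-based node selection, assuming a cluster
# of 5 cache servers and using MD5 from hashlib as a stable hash.
import hashlib

NUM_SERVERS = 5

def node_for_key(key: str, num_servers: int = NUM_SERVERS) -> int:
    """Map a key to a server index via h(key) mod N."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers

print(node_for_key("user:42"))  # some index in 0..4, stable across runs
```

Python's built-in `hash()` is deliberately randomized between runs, which is why a sketch like this reaches for an explicit hash function instead.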

The limitations of naive hashing #

Let’s suppose we have a 5 server system once again.

What happens when we add a new server to our cluster? Let’s say we expand from 5 to 6 servers.

The mapping equation changes from $i = h(\text{key}) \bmod 5$ to $i = h(\text{key}) \bmod 6$.

Keys will shift nodes, like in our example:

  • Before: $923487 \bmod 5 = 2$, so store on server 2.
  • After: $923487 \bmod 6 = 3$, now store on server 3 instead.

In fact, most keys will shift to a different server: under a uniform hash, a key keeps its assignment only when $h(\text{key}) \bmod 5 = h(\text{key}) \bmod 6$, which happens for roughly one key in six. This causes several serious problems. In a caching system, most requests suddenly miss the cache, load may be unbalanced while the cache refills, and simply transferring the remapped data between nodes is expensive in its own right.
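To see the scale of the problem, here is a small experiment (a sketch with made-up keys, reusing the MD5-based mapping from the earlier snippet) that counts how many keys change servers when the cluster grows from 5 to 6:

```python
# Estimate how many keys change servers when N goes from 5 to 6
# under naive hash(key) mod N mapping.
import hashlib

def node_for_key(key: str, num_servers: int) -> int:
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers

keys = [f"user:{i}" for i in range(100_000)]
moved = sum(node_for_key(k, 5) != node_for_key(k, 6) for k in keys)
print(f"{moved / len(keys):.1%} of keys changed servers")  # roughly 83%
```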

Introducing consistent hashing #

Consistent hashing solves our redistribution problem by changing how we think about the hash space. Instead of mapping directly to server indices, we map into a circular space.

Here’s how it works:

  1. Imagine our hash function $h$ outputs values in a range, such as $[0, 2^{32}-1]$
  2. We arrange this range into a circle (a “hash ring”)
  3. Each server is mapped to one or more positions on this ring using the hash function
  4. To find which server handles a key, we hash the key and find the key’s position on the ring, then move clockwise until we land on a server

Imagine our hash space is $[0, 359]$ and we have three servers at the following positions:

  • $h_\text{s1} = 30$
  • $h_\text{s2} = 150$
  • $h_\text{s3} = 270$

Server 1 is responsible for keys in 271 to 30 (the range wraps past 359 back to 0).

Server 2 is responsible for keys in 31 to 150.

Server 3 is responsible for keys in 151 to 270.

The diagram shows our hash ring with servers positioned at 30, 150, and 270. Each server is responsible for the range clockwise from the previous server up to its own position:

```mermaid
graph LR
    s1(("Server 1 (keys at 271 to 30)"))
    s2(("Server 2 (keys at 31 to 150)"))
    s3(("Server 3 (keys at 151 to 270)"))
    s1 --> s2
    s2 --> s3
    s3 --> s1
    %% Style nodes by server
    classDef server1 fill:#f96,stroke:#333
    classDef server2 fill:#9cf,stroke:#333
    classDef server3 fill:#9f9,stroke:#333
    %% Apply styles to nodes
    class s1 server1
    class s2 server2
    class s3 server3
```

If we have a key with $h(\text{key}) = 100$, we find its server by moving clockwise from position 100 until we find the next server, which is Server 2 at 150.

Therefore, Server 2 is responsible for handling the key at position 100.
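As a toy sketch of this lookup, here is the worked example in Python. The $[0, 360)$ hash space and the server positions 30, 150, and 270 are hard-coded to mirror the walkthrough; a real implementation would hash server identifiers to obtain positions:

```python
# A toy hash ring over the [0, 360) space from the example above.
import bisect

ring = {30: "Server 1", 150: "Server 2", 270: "Server 3"}

def lookup(position: int) -> str:
    """Walk clockwise from `position` to the next server on the ring."""
    points = sorted(ring)
    idx = bisect.bisect_left(points, position)
    if idx == len(points):
        idx = 0  # past the last server position, wrap around to the first
    return ring[points[idx]]

print(lookup(100))  # Server 2: next server clockwise from 100 is at 150
print(lookup(300))  # Server 1: wraps around past 359 back to 30
```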

What happens when servers change? #

The magic of consistent hashing becomes apparent when we add or remove servers.

Adding a server #

Let’s add Server 4 at position 210:

```mermaid
graph LR
    s1(("Server 1 (keys at 271 to 30)"))
    s2(("Server 2 (keys at 31 to 150)"))
    s4(("Server 4 (keys at 151 to 210)"))
    s3(("Server 3 (keys at 211 to 270)"))
    s1 --> s2
    s2 --> s4
    s4 --> s3
    s3 --> s1
    %% Style nodes by server
    classDef server1 fill:#f96,stroke:#333
    classDef server2 fill:#9cf,stroke:#333
    classDef server3 fill:#9f9,stroke:#333
    classDef server4 fill:#c9f,stroke:#333,stroke-width:2px
    %% Apply styles to nodes
    class s1 server1
    class s2 server2
    class s3 server3
    class s4 server4
```

Notice our key at position 100 is still assigned to Server 2. In fact, only the keys at positions 151 to 210, which previously belonged to Server 3, are reassigned to Server 4, a much smaller fraction than before!

Removing a server #

Now suppose we have our 4 server system.

If Server 2 fails, only the keys that were assigned to Server 2 need to be reassigned; they move clockwise to the next server on the ring, Server 4 at position 210. The rest of the keys maintain their original assignments.

```mermaid
graph LR
    s1(("Server 1 (keys at 271° to 30°)"))
    s4(("Server 4 (keys at 31° to 210°)"))
    s3(("Server 3 (keys at 211° to 270°)"))
    s1 --> s4
    s4 --> s3
    s3 --> s1
    %% Style nodes by server
    classDef server1 fill:#f96,stroke:#333
    classDef server3 fill:#9f9,stroke:#333
    classDef server4 fill:#c9f,stroke:#333
    %% Apply styles to nodes
    class s1 server1
    class s3 server3
    class s4 server4
```
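Continuing the toy ring sketch from the previous section (same `ring` dictionary and `lookup` function), adding Server 4 and then removing Server 2 only changes ownership of the affected arcs:

```python
# Adding Server 4 at 210, then removing Server 2 at 150.
ring[210] = "Server 4"
print(lookup(100))  # still Server 2: its arc (31..150) is untouched
print(lookup(180))  # Server 4: previously Server 3's arc, now reassigned

del ring[150]       # Server 2 fails or is removed
print(lookup(100))  # Server 4: Server 2's old arc moves clockwise to 210
```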

Virtual nodes enter the picture #

A remaining issue with our basic consistent hashing implementation is that the distribution might not be perfectly balanced. With only a few servers, positioning on the ring becomes critical, and some servers may end up with a disproportionate share of the key space.

To address this problem, we can use virtual nodes. Instead of mapping each physical server to a single point on the hash ring, we map each server to multiple points:

  1. For each physical server, create multiple virtual nodes
  2. Each virtual node gets its own position on the hash ring
  3. Keys are assigned to virtual nodes using the same clockwise assignment rule
  4. The physical server associated with a virtual node handles the key

For example, instead of placing “Server 1” at a single position, we could create virtual nodes like “Server1-vnode0”, “Server1-vnode1”, “Server1-vnode2”, etc., and place them all around the ring.

Here’s a simplified example with 3 servers, each having 3 virtual nodes:

```mermaid
graph LR
    %% Define nodes
    s1a((S1-A)) --> s1b((S1-B))
    s1b --> s1c((S1-C))
    s1c --> s2a((S2-A))
    s2a --> s2b((S2-B))
    s2b --> s2c((S2-C))
    s2c --> s3a((S3-A))
    s3a --> s3b((S3-B))
    s3b --> s3c((S3-C))
    s3c --> s1a
    %% Style nodes by server
    classDef server1 fill:#f96,stroke:#333
    classDef server2 fill:#9cf,stroke:#333
    classDef server3 fill:#9f9,stroke:#333
    %% Apply styles to nodes
    class s1a,s1b,s1c server1
    class s2a,s2b,s2c server2
    class s3a,s3b,s3c server3
```

Let’s suppose we have the following:

  • $S$ = the set of physical servers in our system
  • $s$ = a specific server where $s \in S$
  • $V$ = number of virtual nodes per physical server
  • $v$ = virtual node index, where $0 \leq v < V$
  • $h$ = our hash function
  • $M$ = size of the hash ring
  • $\text{id}_s$ = unique identifier for server $s$

For each server $s$ and each virtual node index $v$, we calculate the position on the hash ring as:

$$\text{position}(s, v) = h(\text{id}_s + v) \bmod M$$

where $\text{id}_s + v$ means combining the server identifier with the virtual node index (e.g. hashing the string “Server1-vnode2”).

When a key $k$ needs to be mapped, we:

  1. Calculate its position: $\text{pos}_k = h(k) \bmod M$
  2. Find the virtual node with the next position clockwise from $\text{pos}_k$
  3. Assign the key to the physical server that owns that virtual node

Having virtual nodes helps distribute the load more evenly across physical servers.
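Here is a sketch of how this might look in Python, under assumed names (`build_ring`, `lookup`) and with MD5 standing in for any stable hash; it follows the position formula and the clockwise lookup rule described above:

```python
# A sketch of consistent hashing with virtual nodes.
import bisect
import hashlib
from typing import List, Tuple

RING_SIZE = 2**32  # M, the size of the hash ring

def h(value: str) -> int:
    """A stand-in hash function: MD5 reduced onto the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16) % RING_SIZE

def build_ring(servers: List[str], vnodes: int) -> List[Tuple[int, str]]:
    """Place `vnodes` virtual nodes per server: position(s, v) = h(id_s + v) mod M."""
    ring = [(h(f"{s}-vnode{v}"), s) for s in servers for v in range(vnodes)]
    return sorted(ring)

def lookup(ring: List[Tuple[int, str]], key: str) -> str:
    """Hash the key and walk clockwise to the next virtual node's physical server."""
    pos = h(key)
    idx = bisect.bisect_left(ring, (pos, ""))
    return ring[idx % len(ring)][1]  # wrap around past the last position

ring = build_ring(["Server1", "Server2", "Server3"], vnodes=100)
print(lookup(ring, "user:42"))  # one of Server1 / Server2 / Server3
```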

Moving keys #

Let’s calculate how many keys need to be redistributed when adding or removing a server in a system using virtual nodes:

  • $K$ = total number of keys in the system
  • $N$ = number of physical servers before the change
  • $V$ = number of virtual nodes per physical server
  • $M$ = size of the hash ring

The key redistribution is an $O(\frac{K}{N})$ operation. Here’s why that’s the case.

When adding a new server #

The new server will introduce $V$ new virtual nodes on the hash ring. Each virtual node will take responsibility for a portion of the hash ring.

Let’s derive how many keys need to be moved.

With $N$ servers before the addition, each with $V$ virtual nodes, we have a total of $N \cdot V$ virtual nodes distributed around the hash ring. Assuming a good hash function, each virtual node handles approximately $\frac{K}{N \cdot V}$ keys.

When we add a new server with $V$ virtual nodes, the total becomes $(N+1) \cdot V$ virtual nodes. Since the new server’s virtual nodes are randomly distributed on the ring, they will take responsibility for approximately $\frac{V}{(N+1) \cdot V} = \frac{1}{N+1}$ of the hash space.

Therefore, the expected number of keys that need to be moved is:

$$\frac{K}{N+1}$$
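As a quick empirical check of this estimate (reusing the `build_ring` and `lookup` helpers from the virtual-node sketch above, with made-up keys and server names), going from 4 to 5 servers should move roughly a fifth of the keys:

```python
# Compare assignments before and after adding a fifth server.
keys = [f"user:{i}" for i in range(50_000)]

before = build_ring(["S1", "S2", "S3", "S4"], vnodes=100)        # N = 4
after = build_ring(["S1", "S2", "S3", "S4", "S5"], vnodes=100)   # N + 1 = 5

moved = sum(lookup(before, k) != lookup(after, k) for k in keys)
print(f"{moved} of {len(keys)} keys moved ({moved / len(keys):.1%})")  # close to 20%
```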

When removing a server #

When a server with $V$ virtual nodes is removed, the keys it was responsible for must be redistributed to the remaining servers.

With $N$ servers, each server handles approximately $\frac{1}{N}$ of the keys due to its $V$ virtual nodes being distributed around the ring. When one server fails or is removed, all the keys it was responsible for (approximately $\frac{K}{N}$) need to be redistributed among the remaining servers.

Therefore, the expected number of keys that need to be moved is:

$$\frac{K}{N}$$
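For a concrete sense of scale (numbers chosen purely for illustration): with $K = 1{,}000{,}000$ keys and $N = 5$ servers, removing a server moves about $\frac{K}{N} = 200{,}000$ keys and adding a sixth moves about $\frac{K}{N+1} \approx 167{,}000$, compared with roughly $\frac{5}{6} \cdot K \approx 833{,}000$ keys remapped under naive mod-$N$ hashing.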