Knowledge Test: Consistent Hashing Explained

Contents

This article contains the questions and solutions related to the article: Consistent hashing explained

Disclaimer: The system design questions are subjective. This article is written based on the research I have done on the topic and might differ from real-world implementations. Feel free to share your feedback and ask questions in the comments.

Consistent hashing is used in distributed systems design such as the URL shortener and Pastebin. I recommend reading the related articles to improve your system design skills.

  1. Why partition the data?
  2. How does a hot spot occur?
  3. What is gossip protocol?
  4. What are tradeoffs of the cache replication?
  5. What is the optimal configuration of spread and the load for the high performance of a cache server?
  6. What are the drawbacks of key range partitioning?
  7. How to locate a node in static hash partitioning?
  8. What are the techniques to resolve a collision in a hashmap?
  9. What is a virtual node?
  10. What data structure is used to store the positions of the nodes on the hash ring?
  11. What is the asymptotic complexity to add a node in consistent hashing?
  12. What hash functions are used in consistent hashing?
  13. What are the benefits of consistent hashing?
  14. What are the drawbacks of consistent hashing?
  15. What are the consistent hashing examples?
  16. What are some popular variants of consistent hashing?
  1. Data partitioning helps to parallelize task execution, and isolate data entities resulting in improved performance and scalability of the system
  2. The large share of data and a high volume of requests to a node results in a hot spot
  3. A peer-to-peer communication technique used by nodes to periodically exchange state information
  4. Only a limited data set is cached and consistency between cache replicas is expensive to maintain
  5. The optimal configuration for the high performance of a cache server is to keep the spread and the load at a minimum
  6. The data set is not necessarily uniformly distributed among the cache servers as there might be more keys in a certain key range
  7. Node ID = hash(key) mod N, where N is the array’s length and the key is the key of the data object
  8. Open addressing and Chaining
  9. The technique of assigning multiple positions to a node is known as a virtual node
  10. The self-balancing binary search tree data structure is used to store the positions of the nodes on the hash ring
  11. O(k/n + logn), where O(k/n) for redistribution of keys and O(logn) for binary search tree traversal
  12. MD5, SHA-1, SHA-256 ,MurmurHash, xxHash, MetroHash, or SipHash1–3 are used in consistent hashing
  13. Horizontally scalable and minimized data movement when the number of nodes changes
  14. Cascading failure due to hot spots and non-uniform distribution of nodes and data
  15. The discord server (discord space or chat room) is hosted on a set of nodes. The client of the discord chat application identifies the set of nodes that hosts a specific discord server using consistent hashing
  16. Multi-probe consistent hashing and Consistent hashing with bounded loads



Get the powerful template to approach system design for FREE on newsletter sign-up:



CC BY-NC-ND 4.0: This license allows reusers to copy and distribute the content in this article in any medium or format in unadapted form only, for noncommercial purposes, and only so long as attribution is given to the creator. The original article must be backlinked.