DisLog

Eric Liu
2 min readDec 30, 2020

This article is for the purpose of logging my progress developing “DisStore”. DisStore is a distributed key-value storage server mimicking some structural designs of the Amazon S3 storage service. The goal of this system is to be able to store key-value pairs from clients with basic CRUD functionalities implemented in a distributed fashion. Qualitatively, the project aims to be reliable, available, and efficient. The system consists of components coarsely categorized as API layer, Partition layer, and Streaming Layer.

API Layer

The API layer consists of the API server and a write-back cache. The API server handles requests from potentially multiple clients concurrently to create, read, update, or delete key-value pairs. If the pair is in the cache, the API server operates with the pair in the cache; else the API server sends the requests to the partition manager on the partition layer. Upon a read request, the API server also handles the response from partition servers and forwards the response to the client.

Partition Layer

The Partition layer consists of partition server nodes and a partition manager. Upon receiving a request from the API server, the partition manager hashes the keys with a consistent hash ring design and forwards the request to the appropriate partition server. The partition server nodes pretend to be the end distributed storage servers for this layer as they serve the requests from the API server. Each one of the partition servers is connected to a stream from the stream layer. The partition server forwards the requests to their assigned stream, which then handles the request. The assignment between partition servers and streams are stored in the partition map table. The partition manager refers to this table to assign streams to partition servers.

Streaming Layer

The streaming layer consists of streams and a stream manager. A stream mimics a data center located in an area. Each time a data center is full, a new disk or even a new machine is added to enlarge the capacity of the particular stream. To mimic this behavior but also maintaining the ability to simulate this design on a single machine, each stream is connected to a separate database implemented with MySQL. Each disk in the datacenter is mimicked as a table in the database. The tables are limited to have a small size so that they fill up easily for simulation purposes. Each time a table is full, the stream sends a request to the stream manager and ask for more table. The stream manager, mimicking the data operation manager, allocates a new table if there are free tables available. Otherwise, the stream manager allocates a new stream for that partition server to connect to. This reassignment of partition is updated in the partition map table and the update also notifies the partition manager to reassign the partition server to the new stream.

Work Log

--

--