Agenda
Introduction
Welcome to week 1 of Build Your Own Key-Value Storage Engine!
Let’s start by making sure what you’re about to build in this series makes complete sense: what’s a storage engine?
A storage engine is the part of a database that actually stores, indexes, and retrieves data, whether on disk or in memory. Think of the database as the restaurant, and the storage engine as the kitchen that decides how food is prepared and stored.
Some databases let you choose the storage engine. For example, MySQL uses InnoDB by default (based on B+-trees). Through plugins, you can switch to RocksDB, which is based on LSM trees.
This week, you will build an in-memory storage engine and the first version of the validation client that you will reuse throughout the series.
Your Tasks
💬 If you want to share your progress, discuss solutions, or collaborate with other coders, join the community Discord server (#kv-store-engine channel).
Assumptions
Keys are lowercase ASCII strings.
Values are ASCII strings.
NOTE: Assumptions persist for the rest of the series unless explicitly discarded.
REST Endpoints to Implement
PUT /{key}:
  The request body contains the value.
  If the key exists, update its value and return success.
  If the key doesn’t exist, create it and return success.
  Keep all data in memory.
GET /{key}:
  If the key exists, return 200 OK with the value in the body.
  If the key does not exist, return 404 Not Found.
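To make the two endpoints concrete, here is a minimal sketch in Go, assuming a plain map guarded by a read-write mutex is enough for this week. The names `Store`, `NewStore`, `Put`, and `Get` are hypothetical, and returning 200 OK for a successful PUT is an assumption, since the task only asks for "success".

```go
package main

import (
	"io"
	"net/http"
	"strings"
	"sync"
)

// Store is a hypothetical in-memory engine: a map guarded by a RWMutex
// so that concurrent HTTP handlers can share it safely.
type Store struct {
	mu   sync.RWMutex
	data map[string]string
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{data: make(map[string]string)}
}

// Put creates the key or overwrites its current value.
func (s *Store) Put(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

// Get returns the value and whether the key exists.
func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// ServeHTTP maps PUT /{key} and GET /{key} onto the store.
func (s *Store) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	key := strings.TrimPrefix(r.URL.Path, "/")
	if key == "" {
		http.Error(w, "missing key", http.StatusBadRequest)
		return
	}
	switch r.Method {
	case http.MethodPut:
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		s.Put(key, string(body))
		// Assumption: 200 OK for both create and update.
		w.WriteHeader(http.StatusOK)
	case http.MethodGet:
		value, ok := s.Get(key)
		if !ok {
			http.NotFound(w, r) // 404 Not Found
			return
		}
		io.WriteString(w, value) // implicit 200 OK
	default:
		w.Header().Set("Allow", "GET, PUT")
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
	}
}
```

Because `Store` implements `http.Handler`, a `main` function can serve it directly, for example with `http.ListenAndServe(":8080", NewStore())` (the port is an arbitrary choice). Keeping `Put` and `Get` separate from the HTTP layer should also make it easier to swap the map for another data structure in later weeks.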
Client & Validation
Implement a client to validate your server:
Read the testing scenario from this file: put.txt.
Run an HTTP request for each line:
PUT k v → Send a PUT to /k with body v.
GET k v → Send a GET to /k. Confirm that v is returned. If not, something is wrong with your implementation.
GET k NOT_FOUND → Send a GET to /k. Confirm that 404 Not Found is returned. If not, something is wrong with your implementation.
Each request must be executed sequentially, one line at a time; otherwise, out-of-order responses may fail the client’s assertions.
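The validation loop described above could be sketched as follows in Go. This is only one possible shape: `Step`, `ParseLine`, `Check`, `Run`, and `RunAll` are hypothetical names, and the parser assumes fields are separated by single spaces. It also treats any 2xx status as success for PUT, since the exact code is not specified.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// Step is one parsed scenario line.
type Step struct {
	Method string // "PUT" or "GET"
	Key    string
	Arg    string // PUT: value to send; GET: expected value or NOT_FOUND
}

// ParseLine splits "PUT k v", "GET k v", or "GET k NOT_FOUND".
// Assumption: fields are separated by single spaces.
func ParseLine(line string) (Step, error) {
	f := strings.SplitN(line, " ", 3)
	if len(f) != 3 || (f[0] != "PUT" && f[0] != "GET") {
		return Step{}, fmt.Errorf("malformed line %q", line)
	}
	return Step{Method: f[0], Key: f[1], Arg: f[2]}, nil
}

// Check compares a response with what the step expects.
func Check(s Step, status int, body string) error {
	switch {
	case s.Method == "PUT":
		if status < 200 || status > 299 {
			return fmt.Errorf("PUT %s: unexpected status %d", s.Key, status)
		}
	case s.Arg == "NOT_FOUND":
		if status != http.StatusNotFound {
			return fmt.Errorf("GET %s: want 404, got %d", s.Key, status)
		}
	case status != http.StatusOK || body != s.Arg:
		return fmt.Errorf("GET %s: want %q, got %d %q", s.Key, s.Arg, status, body)
	}
	return nil
}

// Run sends one request to baseURL and validates the response.
func Run(c *http.Client, baseURL string, s Step) error {
	var body io.Reader // stays nil for GET
	if s.Method == "PUT" {
		body = strings.NewReader(s.Arg)
	}
	req, err := http.NewRequest(s.Method, baseURL+"/"+s.Key, body)
	if err != nil {
		return err
	}
	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	got, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	return Check(s, resp.StatusCode, string(got))
}

// RunAll executes the scenario sequentially, one line at a time,
// and stops at the first failed assertion.
func RunAll(c *http.Client, baseURL string, scenario io.Reader) error {
	sc := bufio.NewScanner(scenario)
	for n := 1; sc.Scan(); n++ {
		step, err := ParseLine(sc.Text())
		if err != nil {
			return fmt.Errorf("line %d: %w", n, err)
		}
		if err := Run(c, baseURL, step); err != nil {
			return fmt.Errorf("line %d: %w", n, err)
		}
	}
	return sc.Err()
}
```

To use it, open put.txt with `os.Open` and pass the file to `RunAll` along with `http.DefaultClient` and your server's base URL (for example `http://localhost:8080`, assuming that is where your server listens). Since `RunAll` waits for each response before reading the next line, requests never overlap.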
Input File Generation
If you want to generate an input file with a different number of lines, you can use this Go generator:
go run gen.go <format> <lines>
<format> is the format to generate.
<lines> is the number of lines.
At this stage, you need a put-type file, so for example, if you need one million lines:
go run gen.go put 1000000
[Optional] Metrics
Add basic metrics for latency:
Record start and end time for each request.
Keep a small histogram of latencies in milliseconds.
At the end, print p50, p95, and p99.
This work is optional as there is no latency target in this series. However, it can be an interesting point of comparison across weeks to see how your changes affect latency.
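If you want a starting point for the metrics, here is one possible sketch in Go. It assumes 1 ms buckets with a single overflow bucket and the nearest-rank method for percentiles. The `Histogram` type and its methods are hypothetical names, not part of the series.

```go
package main

import (
	"fmt"
	"time"
)

// Histogram counts latencies in 1 ms buckets. The last bucket absorbs
// every latency at or above its lower bound.
type Histogram struct {
	buckets []uint64
	total   uint64
}

// NewHistogram tracks latencies from 0 to maxMillis milliseconds.
func NewHistogram(maxMillis int) *Histogram {
	return &Histogram{buckets: make([]uint64, maxMillis+1)}
}

// RecordMillis adds one observation, clamped to the tracked range.
func (h *Histogram) RecordMillis(ms int) {
	if ms < 0 {
		ms = 0
	}
	if ms >= len(h.buckets) {
		ms = len(h.buckets) - 1
	}
	h.buckets[ms]++
	h.total++
}

// Time runs fn and records how long it took (start and end time).
func (h *Histogram) Time(fn func()) {
	start := time.Now()
	fn()
	h.RecordMillis(int(time.Since(start).Milliseconds()))
}

// Percentile returns the bucket (in ms) holding the p-th percentile,
// for p in 1..100, using the nearest-rank method.
func (h *Histogram) Percentile(p int) int {
	if h.total == 0 {
		return 0
	}
	rank := (uint64(p)*h.total + 99) / 100 // ceil(p * total / 100)
	var seen uint64
	for ms, count := range h.buckets {
		seen += count
		if seen >= rank {
			return ms
		}
	}
	return len(h.buckets) - 1
}

// String formats the three percentiles printed at the end of a run.
func (h *Histogram) String() string {
	return fmt.Sprintf("p50=%dms p95=%dms p99=%dms",
		h.Percentile(50), h.Percentile(95), h.Percentile(99))
}
```

In the client, you could wrap each request in `h.Time(func() { ... })` and print `h` once the scenario is done. Note that an in-memory server on localhost will likely answer in well under a millisecond, so you may want finer buckets (microseconds, for instance) if every request lands in bucket 0.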
Wrap Up
That’s it for this week! You have built a simple storage engine that keeps everything in memory.
In two weeks, we will level up. You will delve into a data structure widely used in key-value databases: LSM trees.