How Discord rebuilt message storage before MongoDB's single replica set hit its limit
At 100 million stored messages, a single MongoDB replica set stopped fitting in memory — and a year later, a near-empty channel almost took the new database down too.
DISCORD·2017·DATABASES / SCALE
System stress over timeBreach at T+7
120M+Messages/day at writing
12 nodesCluster size
<1msWrite latency
<5msRead latency
BASELINE
A Database Built for Speed, Not Scale
Discord shipped its first version in under two months in early 2015. For that kind of speed, MongoDB
was the obvious pick — everything lived in one MongoDB replica set, with messages indexed by
channel_id and created_at.
That was a deliberate trade, not an oversight. The team knew they weren’t going to lean on MongoDB’s
sharding long-term — it was complicated and not known for stability — so they built with an easy path
to migrate later. Ship fast now, get robust when it actually matters.
T+0 (NOV 2015)
The Climb
Around 100 million stored messages, the warning signs arrived on schedule: the data and its index no
longer fit in RAM, and latencies turned unpredictable. It was time to move.
DECISION
Seven Requirements, One Database
The team wrote down what they actually needed: linear scalability without manual resharding, automatic
failover, low maintenance, proven technology, predictable performance (alarms fired if the API’s
p95 crossed 80ms), no blob-store deserialization tax, and open source.
MODELING
When 'It Can' Doesn't Mean 'It Should'
Cassandra’s compound key gave Discord a partition key (channel_id) and a clustering key. Message
IDs on Discord are Snowflakes — chronologically sortable — so message_id became that clustering key,
letting Cassandra range-scan a channel directly instead of guessing.
Then the import hit a wall: log warnings about partitions exceeding 100MB, despite Cassandra
advertising support for 2GB partitions. Large partitions pile garbage-collection pressure onto
compaction and cluster expansion, and they can’t be spread across the cluster. A single Discord channel
can run for years — that partition only grows.
DARK LAUNCH
A Required Field, Suddenly Null
Discord dual-wrote to MongoDB and Cassandra in production before fully switching over. Almost
immediately, the bug tracker filled with author_id showing up null — on a field that was supposed
to be required.
That fix surfaced a second, quieter problem: deletes in Cassandra are themselves writes — “tombstones,”
kept around for replication before being purged on compaction. Writing null generates a tombstone
exactly like a delete does. With 16 columns in the schema but only 4 typically set per message, Discord
was writing 12 needless tombstones per message. The fix: only write non-null values.
ROLLOUT
Flawless — For Six Months
Performance matched expectations exactly: sub-millisecond writes, sub-5-millisecond reads, consistent
regardless of which data was being touched. Cassandra became the primary database within a week of
launch, and MongoDB was phased out.
T+6MO
The Big Surprise
Six months in, Cassandra went unresponsive — 10-second “stop-the-world” garbage collection pauses,
with no obvious cause. The trail led to one channel taking 20 seconds to load: the Puzzles & Dragons
Subreddit’s public Discord, a server with exactly one visible message.
AFTERMATH
A Year Later
A year past the switch, Discord was running a 12-node cluster with a replication factor of 3, still
just adding nodes as data grew. Despite the tombstone incident, total volume had grown from 100
million stored messages to over 120 million sent per day, with performance holding steady. The team —
four backend engineers, no dedicated DevOps — called a low-maintenance database one of the project’s
biggest wins.
A plain-language, AI-drafted and human-edited retelling of the article published on discord.com,
reorganized and explained in our own structure and words, with original analysis in the editor's
note above. The facts, numbers, and decisions belong to the original author and are not altered.
For the full depth, read the source.