1 Comment

Neural Foundry

The choice to stick with JSON for SSTables early on makes sense for prototyping, but I'm curious how much the parsing overhead affects compaction when dealing with larger datasets. In production, most LSM implementations use binary formats precisely because decoding JSON for every record in a k-way merge becomes a bottleneck. Excited to see the switch to block-based SSTables; that usually unlocks way more optimization potential for both reads and compaction throughput.
