Sorry, but "50x larger than RAM" is a pretty small DB - that's an 800 GB database on a machine with 16 GB of RAM. I've usually seen machines with 500-1000x ratios of flash to RAM. "RAM is relatively cheap" is also false when you're storing truly huge amounts of data, which is how the systems you compare yourself to (LevelDB, etc.) are usually deployed. Note that RAM is now the single greatest cost when buying servers.
> Now that the total database is 50 times larger than RAM, around half of the key lookups will require a disk I/O.
That is an insanely high cache hit rate, which should have probably set off your "unrepresentative benchmark" detector. I am also a little surprised at the lack of a random writes benchmark. I get that this is marketing material, though.
> I am also a little surprised at the lack of a random writes benchmark.
Eh? This was 20% random writes, 80% random reads. LMDB is for read-heavy workloads.
> That is an insanely high cache hit rate, which should have probably set off your "unrepresentative benchmark" detector.
No, that is normal for a B+tree; the root page and most of the branch pages will always be in cache. This is why you can get excellent efficiency and performance from a DB without tuning to a specific workload.
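For a rough sense of the numbers, here is a back-of-envelope sketch (the per-page fanouts are assumptions for illustration, roughly 4 KB pages with smallish keys, not LMDB's measured figures): the branch pages make up well under 1% of the tree, so they stay resident even when the leaves are 50x the size of RAM.

```python
# Back-of-envelope: how much of a B+tree is branch pages vs. leaf pages.
# The per-page counts below are assumptions (roughly 4 KB pages, small
# keys), not measured LMDB numbers.

def btree_shape(n_records, records_per_leaf=30, branch_fanout=128):
    """Return (leaf_pages, branch_pages, depth) for a B+tree of n_records."""
    leaves = -(-n_records // records_per_leaf)   # ceiling division
    branches, level, depth = 0, leaves, 1
    while level > 1:
        level = -(-level // branch_fanout)       # pages needed one level up
        branches += level
        depth += 1
    return leaves, branches, depth

leaves, branches, depth = btree_shape(100_000_000)
print(f"depth={depth}, leaf pages={leaves:,}, branch pages={branches:,}")
print(f"branch pages = {branches / (leaves + branches):.2%} of all pages")
# With these assumptions: depth 5, and the branch pages are under 1% of
# the tree, so they stay cached; a random lookup costs at most one leaf I/O.
```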
> Eh? This was 20% random writes, 80% random reads. LMDB is for read-heavy workloads.
The page says "updates," not "writes." Updates are a constrained form of write where you are writing to an existing key. Updates, importantly, do not affect your index structure, while writes do.
> No, that is normal for a B+tree; the root page and most of the branch pages will always be in cache. This is why you can get excellent efficiency and performance from a DB without tuning to a specific workload.
It is normal for a B+tree that is small relative to the memory available on the machine. The "small" was the unrepresentative part of the benchmark, not the "B+tree."
> The page says "updates," not "writes." Updates are a constrained form of write where you are writing to an existing key. Updates, importantly, do not affect your index structure, while writes do.
OK, I see your point. Doing an add/delete workload here would only have made things even worse for LevelDB, because its compaction passes would have had to do a lot more work.
> It is normal for a B+tree that is small relative to the memory available on the machine. The "small" was the unrepresentative part of the benchmark, not the "B+tree."
This was 100 million records and a 5-level-deep tree. Getting to 6 levels deep would take about 10 billion records. Most of the branch pages would still fit in RAM; most queries would require at most 1 more I/O than the 5-level case. The cost is still better than any other approach.
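To sanity-check those depths, here is the same back-of-envelope calculation (same assumed fanouts as the sketch above, ~30 records per 4 KB leaf page and ~128 entries per branch page; illustrative only, not LMDB's exact numbers):

```python
import math

# Same illustrative assumptions as the earlier sketch: ~30 records per
# 4 KB leaf page, ~128 entries per branch page (not LMDB's exact fanouts).
def depth(n_records, records_per_leaf=30, branch_fanout=128):
    pages = math.ceil(n_records / records_per_leaf)
    d = 1
    while pages > 1:
        pages = math.ceil(pages / branch_fanout)
        d += 1
    return d

for n in (100_000_000, 10_000_000_000):
    print(f"{n:>14,} records -> {depth(n)}-level tree")
# ~100 million records works out to 5 levels and ~10 billion to 6; the
# extra level costs at most one more I/O per lookup if that level of
# branch pages no longer fits in RAM.
```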