
> But is the overhead of copying read-only data from secondary storage to RAM really such a big problem

It's not a problem that will affect every developer but it is a critical problem.

Many use cases need access to data in a way that is both fast (~ memory) and durable (~ SSD).

Copying data from SSD to memory helps with performance, but the minute you modify that data you've lost durability. Databases, file systems, caches, etc. are examples of critical use cases, relied on by all developers, that would benefit from something like Optane.
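To make the tradeoff concrete, here's a toy Python sketch (file name made up) of the gap between the fast RAM copy and the durable on-disk copy:

```python
import os
import tempfile

# Durable copy lives on disk; fast copy lives in RAM.
path = os.path.join(tempfile.mkdtemp(), "record.bin")
with open(path, "wb") as f:
    f.write(b"balance=100")

fast_copy = bytearray(open(path, "rb").read())  # fast to access once loaded

# The moment we modify the RAM copy, durability is gone:
fast_copy[8:11] = b"200"

durable_copy = open(path, "rb").read()
print(fast_copy != durable_copy)  # True: a crash here loses the update

# Durability is only restored by an explicit (and slow) write-back + fsync.
with open(path, "wb") as f:
    f.write(fast_copy)
    f.flush()
    os.fsync(f.fileno())
```

With persistent memory the two copies collapse into one: the store to the fast copy *is* the durable write (modulo a cache-line flush).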



What percentage (say) speed improvement would you expect for typical workloads on database, file systems, etc. from a unified memory system?


So there are only a few examples of existing systems being re-written for Optane.

There was Redis [1], which was around ~10% slower than the memory-only version but gained durability. And there was an implementation of Spark Shuffle [2] which was 1.3x - 3.2x faster, but that isn't really stressing I/O as much as other use cases.

For a filesystem you can store the entire metadata in Optane, so EXISTS/LIST-type operations and anything involving a bloom filter would see the full benefit, e.g. an order of magnitude better than NVMe.
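The bloom-filter point can be sketched in a few lines of Python (toy structure, not a real filesystem's layout): negative EXISTS lookups get answered entirely from the fast tier, never touching slow storage.

```python
import hashlib

class BloomFilter:
    """Toy bloom filter: answers 'definitely absent' without touching storage."""

    def __init__(self, bits=1 << 16, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, key):
        # Derive `hashes` independent bit positions from one SHA-256 per seed.
        for i in range(self.hashes):
            h = hashlib.sha256(key + i.to_bytes(1, "big")).digest()
            yield int.from_bytes(h[:8], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.array[p // 8] |= 1 << (p % 8)

    def may_contain(self, key):
        return all(self.array[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))

bf = BloomFilter()
bf.add(b"/var/log/syslog")
print(bf.may_contain(b"/var/log/syslog"))  # True -> go check storage
print(bf.may_contain(b"/no/such/path"))    # almost certainly False: no I/O needed
```

If the filter itself lives in Optane it survives reboots too, so you don't pay to rebuild it on startup.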

[1] https://github.com/pmem/pmem-redis

[2] https://databricks.com/session_na20/accelerating-apache-spar...


I'm sure it's amazing for heavily loaded databases but that's a pretty small fraction of computers.

For a filesystem I'm less sure about the benefits. Most of the waiting time I see is for the CPU, even if the information is already in memory. What I need is better code implementing the filesystem, not a hardware bump. And even if you go all-in on Optane metadata, you only need to replace 1% of your NAND.

I do think there's some really nice potential, but almost all of what I'm interested in can be done with tiny amounts.


Some things are amazingly slow, though; look at how long it takes to install software, for instance. That's rarely CPU-bound at all, but somehow is influenced by I/O mismanagement in the various installers.


Installing software is usually CPU bound on Windows, as decompression, anti-malware scanning, and signature validation of every file write limit throughput. Filesystem metadata management is also a bottleneck.

Other platforms may not have the anti-malware scan but do similar things.


Wow, the Redis case is even worse than I expected! I would've thought maybe a 2x improvement would be normal.

Also I think you're conflating workloads with operations. Sure, the occasional operation like EXISTS or LIST might be faster, but surely most workloads do a lot more than checking a trillion things for existence?

I feel like everything you wrote just makes the case against Optane better than even I could. There seems to be little if any clear performance benefit (being generous here, given slowdowns are also possible as you noted!) to most people's workloads to justify upending everyone's current model of computing. Something like this would probably need to deliver at least an order of magnitude of visible speedup in typical use cases for people to consider it. Which isn't to say some niche workloads might not see 2+ orders of magnitude performance improvement, but the rest of the world clearly won't; why should they have to pay the price for niche use cases?


Why would you expect Optane Redis to be faster than normal Redis?

Redis is an in-memory system where you are compromising durability for performance. Optane Redis gets you the best of both worlds.

Examples where you are comparing Optane against SSD is where you do see significant improvements especially for smaller bits of data.

And EXISTS/LIST operations are more than just "occasional" operations for data storage systems.


> Why would you expect Optane Redis to be faster than normal Redis ?

Sorry for the confusion. To clarify, my sentences were separate; I wasn't saying I expected 2x for Redis specifically. I was just saying I didn't expect a 10% slowdown for Redis, and that I expected a 2x improvement typically (not necessarily for Redis).

> And EXISTS/LIST operations are more than just "occasional" operations for data storage systems.

Again, communication issue. I wrote "occasional" in the sense of "a small set of operations", not "infrequent operations". As in, you're going through a list of all operations a DB supports, and occasionally one pops out as potentially substantially benefiting from Optane.

Regardless, your argument misses the point I'm making. The point was: how much of the total workload time do they take up? Even if Optane brought EXISTS/LIST latency down to zero, your workload (including all OS/network/client/etc. overhead) would literally have to be 90% composed of (i.e. overwhelmingly dominated by) EXISTS/LIST checks to get an order of magnitude speed improvement for the user.
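That's just Amdahl's law; the arithmetic is easy to check:

```python
def overall_speedup(fraction_accelerated, local_speedup):
    """Amdahl's law: speedup of the whole workload when only a fraction
    of its time is accelerated by the given factor."""
    return 1 / ((1 - fraction_accelerated)
                + fraction_accelerated / local_speedup)

# Even with an *infinite* speedup on EXISTS/LIST, a workload spending
# 50% of its time there only doubles in speed:
print(round(overall_speedup(0.5, float("inf")), 2))  # 2.0

# For a 10x overall win, ~90% of total time must be in the accelerated part:
print(round(overall_speedup(0.9, float("inf")), 2))  # 10.0
```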


> I didn't expect a 10% slowdown for Redis

Pretty sure you misunderstood. If you forced Redis to use an SSD (persistent storage) for everything Redis normally uses DRAM for and only observed a 10% slowdown, it would be a goddamn miracle!

> how much of the total workload time do they take up

If you're talking about a read-heavy workload, the only good thing about Optane is that it's a little cheaper than DRAM. But those workloads are easy to scale (just buy 2x caches to get 2x throughput) so they're often not worth discussing.

Also reposting my comment from above:

> PCIe Optane was a thing and it achieved 10us latency whereas today's fastest SSDs get 40us. IIRC the DIMM version of Optane was <1us, literally an order of magnitude faster!


I didn't look into the details of the Redis port to Optane, but I work on GemFire, which is a similar in-memory system. I would expect one big benefit not captured by the 10% number to be startup time. Redis uses an operations log; on startup you have to replay the operations, which can take a while, and you have to compact the log periodically. So startup / recovery should be quicker, and you have a whole source of complexity and bugs you no longer need to worry about, because the hardware and OS should solve it for you.
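A minimal sketch of the recovery path in question (toy log format, not Redis's actual AOF encoding):

```python
import json

def recover_from_log(log_lines):
    """Rebuild in-memory state by replaying an operations log,
    the way Redis-style AOF recovery works. Cost grows with log length."""
    state = {}
    for line in log_lines:
        op = json.loads(line)
        if op["cmd"] == "SET":
            state[op["key"]] = op["value"]
        elif op["cmd"] == "DEL":
            state.pop(op["key"], None)
    return state

log = [
    '{"cmd": "SET", "key": "a", "value": 1}',
    '{"cmd": "SET", "key": "b", "value": 2}',
    '{"cmd": "DEL", "key": "a"}',
]
print(recover_from_log(log))  # {'b': 2}
```

With persistent memory the live data structure *is* the durable copy, so "recovery" is just mapping it back in: effectively constant in log length, and the whole log-compaction machinery disappears.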


Sure, but again, that is clearly far, far too minor of a benefit to justify the radical change we are proposing to everyone's model of computing.


I'm not sure it's all that radical. Sure, you can think of ways to completely clean slate redesign various applications and systems. But equally there are reasonably straightforward ways to integrate it into existing systems to achieve good speedups.


Are changes to everyone's model of computing actually radical? I understand that taking advantage of the fact that memory actually persists is somewhat non-trivial when done via byte-addressable accesses, and that even with libraries doing these things, developers gain a new set of properties to worry about, but... isn't this all optional? Why can't software treat persistent memory as just a lot denser RAM?
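You can already approximate that model today with a memory-mapped file; a rough Python sketch (on real pmem the flush would be a cache-line flush rather than msync, and you'd map a DAX file, but the programming model is the same):

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pmem.bin")
with open(path, "wb") as f:
    f.truncate(4096)                 # stand-in for a persistent-memory region

# Map it and use it like ordinary memory: plain byte-addressable stores.
with open(path, "r+b") as f:
    buf = mmap.mmap(f.fileno(), 4096)
    buf[0:5] = b"hello"              # looks exactly like a RAM write
    buf.flush()                      # on real pmem: a cache-line flush
    buf.close()

# ...yet the data survives the process "forgetting" it: remap and read back.
with open(path, "r+b") as f:
    buf = mmap.mmap(f.fileno(), 4096)
    print(bytes(buf[0:5]))           # b'hello'
    buf.close()
```

Software that doesn't care about persistence can just treat the region as denser RAM and never call flush at a meaningful point; the new failure-atomicity properties only bite you if you opt into relying on them.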


I recommend looking at this presentation from Oracle for an example of the benefits of PMem: "Under the Hood of an Exadata Transaction – How We Harnessed the Power of Persistent Memory"

https://www.youtube.com/watch?v=ertF5ZwCHP0



