I never understood what problem single-level storage was actually supposed to solve. In the real world you still need a notion of startup and shutdown because programs crash, memory gets corrupted, things get screwed up, hardware changes, etc., so it's not like we were going to avoid those. So, from a technical standpoint, why exactly would I want to merge my RAM with my SSD? The only thing I can think of is that it lets you achieve genuine "zero-copy" usage of data. Which, I mean, is cool. But is the overhead of copying read-only data from secondary storage to RAM really such a big problem for >90% of users that it justifies changing everyone's very notion of computing?
> But is the overhead of copying read-only data from secondary storage to RAM really such a big problem
It's not a problem that will affect every developer but it is a critical problem.
Many use cases need access to data in a way that is both fast (~ memory) and durable (~ SSD).
Copying data from SSD to memory helps with performance, but the minute you modify that data you've lost durability. Databases, file systems, caches, etc. are examples of critical use cases, relied on by all developers, that would benefit from something like Optane.
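To make the fast-vs-durable tension concrete, here's a minimal Python sketch using `mmap` on an ordinary file (so it runs anywhere, no Optane needed): the in-memory write is RAM-speed but not durable until you explicitly flush it back to storage. On persistent memory that flush step shrinks to a cache-line write-back rather than a page write to an SSD. The file path is made up for the example.

```python
import mmap
import os
import tempfile

# Sketch of the durability gap persistent memory closes: with a plain
# file + mmap, an in-memory write is fast but not durable until flushed.
path = os.path.join(tempfile.mkdtemp(), "state.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)          # pre-size the backing file

with open(path, "r+b") as f:
    mem = mmap.mmap(f.fileno(), 4096)
    mem[0:5] = b"hello"              # RAM-speed write: not yet durable
    mem.flush()                      # msync: now the bytes are on storage
    mem.close()

with open(path, "rb") as f:
    assert f.read(5) == b"hello"     # survives independently of the mapping
```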
So far there are only a few examples of existing systems being rewritten for Optane.
There was Redis [1], which was ~10% slower than the memory-only version but gained you durability. And there was an implementation of Spark Shuffle [2], which was 1.3x - 3.2x faster, but that isn't really stressing I/O as much as other use cases.
For a filesystem you can store the entire metadata in Optane, so EXISTS/LIST-type operations and anything involving a bloom filter would see the full benefit, e.g. an order of magnitude better than NVMe.
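For what it's worth, a toy bloom filter shows why those checks are so latency-sensitive: each lookup touches only a handful of individual bits, so the access latency of whatever medium holds the filter dominates the cost. This is an illustrative sketch only; the sizes and hashing scheme are arbitrary choices, not any real filesystem's design.

```python
import hashlib

M, K = 1 << 16, 3   # bit-array size and number of hash functions (arbitrary)

def _bits(key: str):
    # Derive K bit positions from the key via salted SHA-256 (toy scheme).
    for i in range(K):
        h = hashlib.sha256(f"{i}:{key}".encode()).digest()
        yield int.from_bytes(h[:4], "big") % M

class Bloom:
    def __init__(self):
        self.bits = bytearray(M // 8)
    def add(self, key: str) -> None:
        for b in _bits(key):
            self.bits[b // 8] |= 1 << (b % 8)
    def might_contain(self, key: str) -> bool:
        # Each query reads just K bits: tiny random reads, latency-bound.
        return all(self.bits[b // 8] & (1 << (b % 8)) for b in _bits(key))

bf = Bloom()
bf.add("/var/log/syslog")
assert bf.might_contain("/var/log/syslog")   # added keys always match
# Absent keys are almost always rejected (false positives are possible
# in principle, but vanishingly unlikely at this fill level).
assert not bf.might_contain("/no/such/path")
```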
I'm sure it's amazing for heavily loaded databases but that's a pretty small fraction of computers.
For a filesystem I'm less sure about the benefits. Most of the waiting time I see is for the CPU, even if the information is already in memory. What I need is better code implementing the filesystem, not a hardware bump. And even if you go all-in on optane metadata, you only need to replace 1% of your NAND.
I do think there's some really nice potential, but almost all of what I'm interested in can be done with tiny amounts.
Some things are amazingly slow, though; look at how long it takes to install software, for instance. That's rarely CPU-bound at all, but somehow is influenced by I/O mismanagement in the various installers.
Installing software is usually CPU-bound on Windows, as decompression, anti-malware scanning, and signature validation of every file write limit throughput. Filesystem metadata management is also a bottleneck.
Other platforms may not have the anti-malware scan but do similar things.
Wow, the Redis case is even worse than I expected! I would've thought maybe a 2x improvement would be normal.
Also, I think you're conflating workloads with operations. Sure, the occasional operation like EXISTS or LIST might be faster, but surely most workloads do a lot more than checking a trillion things for existence?
I feel like everything you wrote just makes the case against Optane better than even I could. There seems to be little if any clear performance benefit for most people's workloads (and I'm being generous here, given that slowdowns are also possible, as you noted!) to justify upending everyone's current model of computing. Something like this would probably need to deliver at least an order of magnitude of visible speedup in typical use cases for people to consider it. Which isn't to say some niche workloads might not see 2+ orders of magnitude performance improvement, but the rest of the world clearly won't; why should they have to pay the price for niche use cases?
> Why would you expect Optane Redis to be faster than normal Redis ?
Sorry for the confusion. To clarify, my sentences were separate; I wasn't saying I expected 2x for Redis specifically. I was just saying I didn't expect a 10% slowdown for Redis, and that I expected a 2x improvement typically (not necessarily for Redis).
> And EXISTS/LIST operations are more than just "occasional" operations for data storage systems.
Again, communication issue. I wrote "occasional" in the sense of "a small set of operations", not "infrequent operations". As in, you're going through a list of all operations a DB supports, and occasionally one pops out as potentially substantially benefiting from Optane.
Regardless, your argument misses the point I'm making. The point was: how much of the total workload time do they take up? Even if Optane brought EXISTS/LIST latency down to zero, your workload (including all OS/network/client/etc. overhead) would literally have to be 90% composed of (i.e. overwhelmingly dominated by) EXISTS/LIST checks to get an order of magnitude speed improvement for the user.
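That 90% figure is just Amdahl's law with the accelerated fraction eliminated entirely, which a couple of lines of Python can sanity-check:

```python
# Amdahl-style bound: if a fraction f of total runtime is eliminated
# entirely, overall speedup is 1 / (1 - f).
def overall_speedup(f: float) -> float:
    """Speedup when a fraction f of runtime drops to zero cost."""
    return 1.0 / (1.0 - f)

assert round(overall_speedup(0.9), 1) == 10.0  # 10x needs f = 90%
assert round(overall_speedup(0.5), 1) == 2.0   # half the workload -> only 2x
```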
Pretty sure you misunderstood. If you forced Redis to use an SSD (persistent storage) for everything Redis normally uses DRAM for, and only observed a 10% slowdown, it would be a goddamn miracle!
> how much of the total workload time do they take up
If you're talking about a read-heavy workload, the only good thing about Optane is that it's a little cheaper than DRAM. But those workloads are easy to scale (just buy 2x the caches to get 2x the throughput), so they're often not worth discussing.
Also reposting my comment from above:
> PCIe Optane was a thing and it achieved 10us latency whereas today's fastest SSDs get 40us. IIRC the DIMM version of Optane was <1us, literally an order of magnitude faster!
I didn't look into the details of the Redis port to Optane, but I work on GemFire, which is a similar in-memory system. I would expect one big benefit not captured by the 10% number to be startup time. Redis uses an operations log. On startup you have to replay the operations, which can take a while, and you have to clean up the log periodically. So startup/recovery should be quicker, and you now have a whole source of complexity and bugs you don't need to worry about, because the hardware and OS should solve it for you.
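The replay cost is easy to see in miniature. This is a hypothetical mini-log, not Redis's actual AOF format: rebuilding state means re-executing every logged command, so startup time grows with log length, which is exactly the work a durable in-place structure would avoid.

```python
import json

# Hypothetical append-only operations log (illustrative, not Redis's format).
log = [json.dumps(op) for op in (
    {"cmd": "set", "key": "a", "val": 1},
    {"cmd": "set", "key": "b", "val": 2},
    {"cmd": "del", "key": "a"},
)]

def replay(lines):
    """Rebuild state by re-executing every logged command in order."""
    state = {}
    for line in lines:                 # startup cost grows with log length
        op = json.loads(line)
        if op["cmd"] == "set":
            state[op["key"]] = op["val"]
        elif op["cmd"] == "del":
            state.pop(op["key"], None)
    return state

assert replay(log) == {"b": 2}
```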
I'm not sure it's all that radical. Sure, you can think of ways to redesign various applications and systems from a completely clean slate. But equally, there are reasonably straightforward ways to integrate it into existing systems and achieve good speedups.
Are changes to everyone's model of computing actually radical? I mean, I understand that taking advantage of the fact that memory actually persists is somewhat non-trivial when done via byte-addressable accesses, and that even with libraries doing these things for them, developers gain a new set of properties to worry about, but... isn't this all optional? Why can't software treat persistent memory as just much denser RAM?
I recommend looking at this presentation from Oracle for an example of the benefits of PMem:
"Under the Hood of an Exadata Transaction – How We Harnessed the Power of Persistent Memory"
Yeah I think you would still need an "erase all caches and reboot" option but there's no theoretical reason you couldn't have that. The main reason you have to restart desktop OSes so often is because they're ancient and don't isolate components properly. How often do you have to restart your phone? A couple of times a year maybe?
But I agree with your point - it does seem like a very cool idea, but practically it wouldn't make a huge amount of difference, and it basically requires an entirely new, incompatible OS.
I wonder if it would have been more successful in phones actually. iOS already doesn't have a user-accessible filesystem and Android is moving in that direction.
> How often do you have to restart your phone? A couple of times a year maybe?
Me? Like once a month at least, probably twice. Could be as frequent as multiple times in an hour. Just depends on the reason. It could include anything from "battery ran out before I charged" (a few times a year maybe) to "my phone is crashing/behaving erratically" (could be every few weeks) to "Android didn't refresh its MTP database live and won't show the file I added till I reboot" (could be from 5 mins ago). Not an exhaustive set, just listing a few examples.
> I wonder if it would have been more successful in phones actually.
Interesting idea. Maybe? I wonder how much the performance improvement would be for the end-user.
Ok that's not common by any means. Maybe a custom ROM or a new phone is in order?
Personally I reboot my iPhone once a year. It also reboots overnight every 3 months or so for software updates, but that's often not observable because apps restore their state.
Not due to the above (the reboot need isn't frequent enough to bother me), but for other reasons, possibly? It's not high priority for me but I've been thinking about it. Thing is, I love my phone otherwise. An insane amount of resources go into making hardware like this, and the planet's already trashed enough as is; I hate throwing out hardware that works fine just for random software glitches I can easily put up with.
> iOS already doesn't have a user-accessible filesystem
iOS actually did add some sort of native file explorer some time ago – no idea how comprehensive it is, but I guess it shows that even Apple couldn't entirely get rid of this.
> and Android is moving in that direction.
… and I absolutely hate it. Though I think it's not so much getting rid of files as a half-assed attempt at sandboxing, with a completely incompatible new API, various bugs (performance and otherwise), and breakage (flexibly exchanging multi-file formats [1] between arbitrary apps is more or less dead if you follow the new rules, though in that case no sandboxing solution on any OS seems to get it right; as far as I'm aware only macOS even attempts to offer some sort of solution to that problem, and even that solves only part of it).
[1] Like locally mirrored HTML files with multiple pages or separately stored subresources (JS/CSS/media files/…), or movies + subtitles, or multi-part archives, or…