On Sat, Feb 15, 2014 at 9:47 AM, Andy Rudoff <andy@xxxxxxxxxx> wrote:
> On Sat, Feb 15, 2014 at 8:04 AM, Dan Williams <dan.j.williams@xxxxxxxxx>
> wrote:
>>
>> In response to Dave's call [1] and highlighting Jeff's attend request
>> [2] I'd like to stoke a discussion on an emulation layer for atomic
>> block commands.  Specifically, SNIA has laid out their position on the
>> command set an atomic block device may support (NVM Programming Model
>> [3]) and it is a good conversation piece for this effort.  The goal
>> would be to review the proposed operations, identify the capabilities
>> that would be readily useful to filesystems / existing use cases, and
>> tear down a straw man implementation proposal.
>
> ...
>>
>> The argument for not doing this as a
>> device-mapper target or stacked block device driver is to ease
>> provisioning and make the emulation transparent.  On the other hand,
>> the argument for doing this as a virtual block device is that the
>> "failed to parse device metadata" is a known failure scenario for
>> dm/md, but not sd for example.
>
>
> Hi Dan,

Hi Andy.

> Like Jeff, I'm a member of the NVMP workgroup and I'd like to ring in here
> with a couple observations.  I think the most interesting cases where
> atomics provide a benefit are cases where storage is RAIDed across multiple
> devices.  Part of the argument for atomic writes on SSDs is that databases
> and file systems can save bandwidth and complexity by avoiding
> write-ahead-logging.  But even if every SSD supported it, the majority of
> production databases span across devices for either capacity, performance,
> or, most likely, high availability reasons.

The primary Facebook database server (Type 3 [1]) is single-device,
are they an outlier?  I would think scale-out architectures in general
handle database capacity and availability by scaling at the node
level... that said I don't doubt that some are dependent on
multi-device configurations.

[1]: http://opencompute.org/summit/ (slide 12)

> So in my opinion, that very
> much supports the idea of doing atomics at a layer where it applies to SW
> RAIDed storage (as I believe Dave and others are suggesting).

Sure this can expand to a multi-device capability, but that is
incremental to the single device use case.

> On the other side of the coin, I remember Dave talking about this during our
> NVM discussion at LSF last year and I got the impression the size and number
> of writes he'd need supported before he could really stop using his
> journaling code was potentially large.  Dave: perhaps you can re-state the
> number of writes and their total size that would have to be supported by
> block level atomics in order for them to be worth using by XFS?

...and that's the driving example of the value of having a solution
like this upstream.  Beat up on a common layer to determine the
minimum practical requirements across different use cases.

> Finally, I think atomics for file system use is interesting, but also
> exposing them for database use is very interesting.  That means exposing the
> size and number of writes supported to the app and making the file system
> able to turn around and leverage those when a database app tries to use them
> via the file system.  This has been the primary focus of the NVMP workgroup,
> helping ISVs determine what features they can leverage in a uniform way.  So
> my point here is we get the most use out of atomics by exposing them both
> in-kernel for file systems and in user space for apps.

*nod*