On Thu, Apr 24, 2014 at 11:03 AM, Chris Mason <clm@xxxxxx> wrote:
> On 04/24/2014 01:39 PM, Matthew Wilcox wrote:
>>
>> NVMe allows the drive to tell the host what atomicity guarantees it
>> provides for a write command.  At the moment, I don't think Linux has
>> a way for the driver to pass that information up to the filesystem.
>>
>> The value that is most interesting to report is Atomic Write Unit Power
>> Fail ("if you send a write no larger than this, the drive guarantees to
>> write all of it or none of it"), minimum value 1 sector.  [1]
>>
>> There's a proposal before the NVMe workgroup to add a boundary size/offset
>> to modify AWUPF ("except if you cross this boundary, then AWUPF is not
>> guaranteed").  Think RAID stripe crossing.
>>
>> So, three questions.  Is there somewhere already to pass boundary
>> information up to the filesystem?  Can filesystems make use of a larger
>> atomic write unit than a single sector?  And, if the device is internally
>> a RAID device, is knowing the boundary size/offset useful?
>>
>> [1] There is also Atomic Write Unit Normal ("if you send two writes,
>> neither of which is larger than this, subsequent reads will get either
>> one or the other, not a mixture of both"), which I don't think we care
>> about because the page cache prevents us from sending two writes which
>> overlap with each other.
>
> I think we really need the atomics to be vectored.  Send N writes which as a
> unit are not larger than X, but which may span anywhere on device.  An array
> with writeback cache, or a log structured squirrel in the FTL should be able
> to provide this pretty easily?
>
> The immediate use case is mysql (16K writes) on a fragmented filesystem.
> The FS needs to be able to collect a single atomic write made up of N 4K
> sectors.

How big does N need to be before it starts to be generally useful?
Here it seems we're talking on the order of tens of writes, but for
the upper bound Dave said that N could be in the hundreds of thousands
[1].

--
Dan

[1]: http://marc.info/?l=linux-fsdevel&m=139262740324307&w=2
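
To make the AWUPF-plus-boundary rule from Matthew's mail concrete, here is a
minimal sketch of the check a filesystem might do before relying on a single
write being power-fail atomic. Everything named here is hypothetical: the
struct, its fields, and write_is_atomic() are not an existing kernel or NVMe
interface. It also assumes the boundary repeats every `boundary` sectors
starting at `boundary_off`, which is my reading of the proposal rather than
anything it has settled on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-device atomicity limits, all in sectors.
 * Not an existing kernel structure; names are for illustration only.
 */
struct atomic_limits {
	uint32_t awupf;        /* largest write guaranteed all-or-nothing, >= 1 */
	uint32_t boundary;     /* assumed boundary spacing; 0 means no boundary */
	uint32_t boundary_off; /* assumed sector offset of the first boundary */
};

/*
 * Return true if a single write of nr_sectors starting at lba would be
 * covered by the power-fail atomicity guarantee under the assumptions above.
 * Overflow for lba values near UINT64_MAX is not handled in this sketch.
 */
static bool write_is_atomic(const struct atomic_limits *lim,
			    uint64_t lba, uint32_t nr_sectors)
{
	uint64_t shift, first, last;

	assert(nr_sectors > 0);

	/* Writes larger than AWUPF carry no guarantee. */
	if (nr_sectors > lim->awupf)
		return false;

	/* No boundary reported: the size check is all there is. */
	if (lim->boundary == 0)
		return true;

	/*
	 * Shift sector numbers so that boundaries land on multiples of
	 * lim->boundary, then compare the chunk index of the first and
	 * last sector. Different indices mean the write crosses a boundary.
	 */
	shift = lim->boundary - (lim->boundary_off % lim->boundary);
	first = lba + shift;
	last = first + nr_sectors - 1;

	return first / lim->boundary == last / lim->boundary;
}
```

With awupf = 8 and a 64-sector boundary at offset 0, a 4-sector write at
sector 60 passes the check, while a 5-sector write at the same address touches
sector 64 and does not. This only covers the single-write case; the vectored
atomics Chris asks for would need a different interface entirely.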