Christoph,

> Except for SSDs it generally doesn't - the fact that they are written
> at the same time means there is a very high chance they will end up
> on media together for traditional SSD designs.
>
> This might be different when explicitly using some form of data
> placement scheme, and SSD vendors might be able to place PI/metadata
> differently under the hood when a big enough customer asks for it
> (they might not be very happy about the request :)).

There was a multi-vendor effort many years ago (first gen SSD era) to
make vendors guarantee that metadata and data would be written to
different channels. But performance got in the way, obviously.

> One thing that I did implement for my XFS hack/prototype is the ability
> to store a crc32c in the non-PI metadata supported by nvme. This allows
> for low overhead data checksumming as you don't need a separate data
> structure to track where the checksums for a data block are located and
> doesn't require out of place writes. It doesn't provide a ref tag
> equivalent or device side checking of the guard tag unfortunately.

That sounds fine. Again, I don't have a problem with having the ability
to choose whether checksum placement or WAF is more important for a
given application.

> I never could come up with a good use of the app_tag for file systems,
> so not wasting space for that is actually a good thing.

I wish we could just do 4 bytes of CRC32C + 4 bytes of ref tag. I think
that would be a reasonable compromise between space and utility. But we
can't do that because of the app tag escape. We're essentially wasting 2
bytes per block to store a single bit flag.

In general I think 4096+16 is a reasonable format going forward. With
either CRC32C or CRC64 plus full LBA as ref tag.

-- 
Martin K. Petersen
Oracle Linux Engineering
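
To make the space argument concrete, here is a minimal Python sketch of
the tuple layouts discussed above. It assumes the classic 8-byte T10 PI
layout (2-byte guard, 2-byte app tag, 4-byte ref tag, big-endian) with
0xFFFF as the app tag escape value for Type 1/2 protection. WISHED_PI,
pack_t10, checks_disabled and overhead_percent are hypothetical names
used for illustration only, and the 4+4 layout is not an existing format.

```python
import struct

# Classic 8-byte T10 PI tuple, big-endian:
# 2-byte guard tag (CRC16), 2-byte app tag, 4-byte ref tag.
T10_PI = struct.Struct(">HHI")

# Hypothetical 8-byte layout wished for in the text (not a real format):
# 4-byte CRC32C guard + 4-byte ref tag, no app tag.
WISHED_PI = struct.Struct(">II")

# App tag value that tells the device to skip checking this block
# (assumed Type 1/2 semantics).
APP_TAG_ESCAPE = 0xFFFF


def pack_t10(guard, app_tag, lba):
    """Pack a Type 1 style tuple; the ref tag is the low 32 bits of the LBA."""
    return T10_PI.pack(guard, app_tag, lba & 0xFFFFFFFF)


def checks_disabled(tuple_bytes):
    """Return True when the app tag carries the escape value."""
    _guard, app_tag, _ref = T10_PI.unpack(tuple_bytes)
    return app_tag == APP_TAG_ESCAPE


def overhead_percent(data_bytes, pi_bytes):
    """Metadata overhead as a percentage of the data block size."""
    return 100.0 * pi_bytes / data_bytes
```

Under these assumptions, the whole 2-byte app tag exists only to carry
the one-bit "skip checking" flag that checks_disabled() tests, and
overhead_percent(4096, 16) comes to 0.390625 versus 1.5625 for 512+8.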