Jeff,

> We've always been told "don't worry about what the internal block size
> is, that only matters to the FTL." That's obviously not true, but
> when devices only report a 512 byte granularity, we believe them and
> will issue discard for the smallest size that makes sense for the file
> system regardless of whether it makes sense (internally) for the SSD.
> That means 4k for pretty much anything except btrfs metadata nodes,
> which are 16k.

The devices are free to report a bigger discard granularity. We already
support and honor that (for SCSI, anyway). It's completely orthogonal to
the reported logical block size, although it obviously needs to be a
multiple of it.

The real problem is that vendors have zero interest in optimizing for
discard. They are so confident in their FTL and overprovisioning that
they don't view it as an important feature. At all.

Consequently, many of the modern devices that claim to support discard
to make us software folks happy (or to satisfy a purchase order
requirement) complete the commands without doing anything at all.
We're simply wasting queue slots.

Personally, I think discard is dead on anything but the cheapest
devices. And on those it is probably going to be performance-prohibitive
to use it in any other way than a weekly fstrim.

-- 
Martin K. Petersen	Oracle Linux Engineering