>>>>> "Ted" == Theodore Ts'o <tytso@xxxxxxx> writes:

>> The issue being that TRIM is a hint and there are no hard
>> guarantees. Even if a device reports DRAT/RZAT.

Ted> So is this the same as how some devices will turn into bricks if
Ted> you send trim commands too quickly --- i.e., they are buggy, crappy
Ted> devices?

Well, it's actually per spec. Even if a device reports successful
completion on a DSM TRIM command it is not required to actually do
anything because TRIM is a hint.

The DRAT/RZAT flags indicate what the expected results are if a device
decides to honor the request (or parts of it). Some devices will report
zeroes only for blocks that are aligned to their internal allocation
units. Whereas misaligned heads/tails of the TRIM request will contain
old data, zeroes or garbage.

Early SSDs would drop TRIMs under load. I think we've now moved to a
world where TRIMs are mostly dropped when the FTL is in error recovery.
But we have no insight into internal FTL state.

Some RAID controller vendors explicitly whitelist drive models that do
the right thing in their firmware to overcome this. Others rely on WRITE
SAME to ensure that you don't get parity mismatches for RAID5/6.

Ted> Basically, who was practicing engineering malpractice? The SSD
Ted> vendors, or the T10/T13 spec authors?

I think it's important to emphasize that T10/T13 specs are mainly
written by device vendors. And they have a very strong objection to
complicating the device firmware, keeping internal state, etc. So the
outcome is very rarely in the operating system's favor.

I completely agree that these flags are broken by definition.

The only discard approach that provides a guaranteed result is WRITE
SAME with the UNMAP bit set (i.e. SCSI only).

You can also use a discard followed by a read of the block range to
verify that you actually get zeroes. And then manually patch up any
pieces that didn't stick.
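
To make the read-back-and-patch idea concrete, here is a rough sketch of
what the verify step could look like from userspace. It is an
illustration only, not what any kernel code or RAID firmware actually
does. The function names, the 512-byte block size and the 1 MiB read
chunk are all made up for the example, and the discard itself (e.g. via
blkdiscard or fstrim) is assumed to have been issued beforehand. On a
real block device you would probably also want to bypass the page cache
(O_DIRECT or similar) so that the reads actually reach the media.

```python
import os

CHUNK = 1 << 20  # read granularity; arbitrary, must be a multiple of `block`


def find_nonzero_extents(fd, start, length, block=512):
    """Scan [start, start + length) and return a list of (offset, length)
    runs of blocks that still contain non-zero bytes after a discard."""
    extents = []
    zero = bytes(block)
    pos, end = start, start + length
    while pos < end:
        buf = os.pread(fd, min(CHUNK, end - pos), pos)
        if not buf:
            break  # hit end of device/file
        for i in range(0, len(buf), block):
            piece = buf[i:i + block]
            if piece != zero[:len(piece)]:
                off = pos + i
                # Merge with the previous run if the two are adjacent.
                if extents and extents[-1][0] + extents[-1][1] == off:
                    extents[-1] = (extents[-1][0], extents[-1][1] + len(piece))
                else:
                    extents.append((off, len(piece)))
        pos += len(buf)
    return extents


def patch_with_zeroes(fd, extents):
    """Explicitly write zeroes over every run the discard did not clear."""
    for off, n in extents:
        os.pwrite(fd, bytes(n), off)
```

Usage would be along the lines of: issue the discard, call
find_nonzero_extents() on the same range, and hand whatever it returns
to patch_with_zeroes(). The point is that only the explicit writes give
you a guarantee; the discard is just an optimization on top.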

Ted> If this is a case that there is just a bunch of crap SSD's out
Ted> there, then maybe we should still do this, but just not enable it
Ted> by default, and force users to manually configure mount options or
Ted> fstrim if they think they have devices that are competently
Ted> implemented?

The good news is that most devices that report DRAT/RZAT are doing the
right thing due to server/RAID vendor pressure. But SSD vendors are
generally not willing to give such guarantees in the datasheets.

Many of these gray areas or slight enhancements to what's mandated by
the T10/T13 specs are negotiated as part of a typical drive procurement
process. The vendor will implement the additional features and
guarantees requested by Dell/HP/IBM/Oracle/etc. Sometimes the
enhancements will trickle into later versions of the generic SSD
firmware. Sometimes they won't.

It's really no different from hard drives. I'd choose a server vendor
branded version of a disk drive over the generic version any day. Both
because of binning and because of the additional data integrity and
error recovery features that are likely present in the firmware.

-- 
Martin K. Petersen	Oracle Linux Engineering