On Wed, Aug 10, 2016 at 11:50 AM, Tom Yan <tom.ty89@xxxxxxxxx> wrote: > On 10 August 2016 at 14:34, Shaun Tancheff <shaun.tancheff@xxxxxxxxxxx> wrote: >> >> You are correct in that we can advertise the larger limit in >> ata_scsi_dev_config() when only SCT write same is supported >> rather than fall back to WS10. > > ata_scsi_dev_config()? Not sure if I follow. We should only need to > report Maximum Write Same Length in the Block Limit VPD > (ata_scsiop_inq_b0). > >> >> TRIM is bound by an interface maximum. You can only stuff 64 entries >> of a 16 bit length followed by 48 bit lba into a 512 byte block. > > Well that is actually the minimum. Modern SSDs often support more than > one-block payload (e.g. 8, 16...). It's just our SCSI disk driver > statically limit it to the minimum. Though it allows only 0xffffffff / > 512 = 8388607 (SD_MAX_WS16_BLOCKS) blocks per WRITE SAME (16) command > anyway, so we can at most allow only a 2-block (well, or 3-block) > payload. Ah. Thanks for the clarification. >> SCT is not restricted (you can wipe an entire drive) however there >> is a practical limit in that I have coded the SCT to operate >> in the foreground so the command could timeout depending >> on how fast the media can write. >> >> On my machine the default timeout is 30s so to clear 4194240 (16G): > > You are talking about an AF 4Kn drive I suppose? For a 512e drive it > should be only ~2G. I stand corrected. Since all the kernel math is 512 byte sectors you are absolutely correct and this isn't an issue at all. We should report SD_MAX_WS16_BLOCKS when only SCT is available and 4194240 when TRIM is available. You can safely ignore the remainder of my pointless rambling. Thanks for you patience none the less. >> 30s -> 547 MB/s >> 60s -> 274 MB/s >> 90s -> 183 MB/s >> 120s -> 137 MB/s >> >> So for my drives 8G and 30s or 16G and 60s is fine. >> For older or slow drives 4G and 30s should be fine. >> >> I really am not sure what would be considered the correct >> solution though. I believe that the WRITE SAME defaults >> are currently being chosen around physical limits. > > Not sure about what WRITE SAME defaults and physical limits you are > referring to. Just that the WRITE SAME limit SD_MAX_WS16_BLOCKS is derived from the request interface as opposed to some other arbitrary limit. >> >> We could reduce the trim to 16 entries when SCT is available and >> bump SCT to the same 16 * 63335 maximum? > > I am not sure if that's a good idea. Small TRIM payloads (hence more > TRIM commands) could lead to noticeable overhead in my experience. But > if 4194240 blocks is really too many for SCT Write Same in any case, I > guess we will have to compromise, since the Maximum Write Same Length > field is shared. (Now it feels unfortunate that we decided to switch > from UNMAP -> TRIM to WRITE SAME (16) -> TRIM long ago.) The question > is, do we want the value to stay at 4194240 when SCT Write Same is not > available? > > I have no idea what the value should be. But, given the fact sector > size seems to matter much in the SCT case, perhaps at the very least, > we would want to derive the multiplier from that? > >> >> I think we can also bump the command timeout for WRITE SAME? > > I have no idea where the timeout comes from. Is it even a thing in the > kernel (instead of one in the firmware of the drive or the ACS > standard)? Oh ... the timeout I was thinking of is this (or defaulted from): /sys/block/sdX/device/timeout It's the struct request_queue's 'rq_timeout' At least that's where my line of thought was going. I will update the patch to use SD_MAX_WS16_BLOCKS and 4194240 as appropriate. Regards, Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html