On Thu, Aug 25, 2016 at 1:31 AM, Tom Yan <tom.ty89@xxxxxxxxx> wrote: > On 25 August 2016 at 05:28, Shaun Tancheff <shaun.tancheff@xxxxxxxxxxx> wrote: >> On Wed, Aug 24, 2016 at 12:31 AM, Tom Yan <tom.ty89@xxxxxxxxx> wrote: >>> On 24 August 2016 at 11:33, Martin K. Petersen >>> <martin.petersen@xxxxxxxxxx> wrote: >>>>>>>>> "Tom" == Tom Yan <tom.ty89@xxxxxxxxx> writes: >>>> >>>> Tom> Nope, SCSI Write Same commands does not have payload (or in SCSI >>>> Tom> terms, parameter list / data-out buffer). >>>> >>>> WRITE SAME has a a payload of 1 logical block (unless NDOB is set but we >>>> have had no good reason to support that yet). >>> >>> Interesting, I wasn't aware of the bit. I just didn't see any >>> parameter list defined for any of the Write Same commands. Ah wait, it >>> carries the pattern (the "same") and so. >>> >>> Hmm, it doesn't seem like the translation implemented in this patch >>> series cares about the payload though? >> >> As repeated here and elsewhere the payload is: >> scsi_sglist(cmd) >> and it was put there by scsi_init_io() when it called scsi_init_sgtable() > > What I mean is: > > put_unaligned_le32(0u, &sctpg[10]); > > So even if the payload of the SCSI Write Same commands instruct a > non-zero pattern writing, SCT Write Same will conveniently ignore that > do zero-filling anyway. That's what I mean by "doesn't care about the > payload". If you would like to add support for that it would be nice. I am not planning to do so here. > Though that would only be case with SG_IO though. SCSI Write Same > issued from block layer (BLKZEROOUT) will be instructing zero-filling > anyway. >>>> UNMAP has a payload that varies based on the number of range >>>> descriptors. The SCSI disk driver only ever issues a single descriptor >>>> but since libata doesn't support UNMAP this doesn't really come into >>>> play. >>>> >>>> Ideally there would be a way to distinguish between device limits for >>>> WRITE SAME with the UNMAP bit and for "regular" WRITE SAME. One way to >>>> do that would be to transition the libata discard implementation over to >>>> single-range UNMAP, fill out the relevant VPD page B0 fields and leave >>>> the WRITE SAME bits for writing zeroes. >>>> >>>> One reason that has not been particularly compelling is that the WRITE >>>> SAME payload buffer does double duty to hold the ATA DSM TRIM range >>> >>> Huh? Why would the SATL care about its payload buffer for TRIM (i.e. >>> when the UNMAP bit is set)? Doesn't it just read the LBA and NUMBER OF >>> BLOCKS field and pack TRIM ranges/payload according to that? >>> >>>> descriptors and matches the required ATA payload size. Whereas the UNMAP >>> >>> Why would it need to "matches the required ATA payload size"? >>> >>>> command would only provide 24 bytes of TRIM range space. >>> >>> I don't really follow. The UNMAP descriptor has LBA (8 bytes / 64-bit) >>> and NUMBER OF BLOCKS (4 bytes / 32-bit) field of the same length as >>> Write Same (16). Even if the SCSI disk driver will only send one >>> descriptor, it should work as good as Write Same (16). >> >> The "payload" is the data block transferred with the command. >> The "descriptor" is, in this context, the contents of the payload as >> it "describes" what the action the command is supposed to perform. >> > > I know right. > >> The "payload" contains the "descriptor" that "describes" how >> DSM TRIM should determine which logical blocks it should UNMAP. > > This should only be the case of UNMAP command, but not SCSI Write Same > with UNMAP bit set. And either way it should not affect how large the > ATA TRIM payload can be. The current SATL does not report support for UNMAP. If you think it should be added please submit a patch. If you would like to extend the current translate to support sending multiple blocks of trim data please submit a patch. >>>> Also, please be careful with transfer lengths, __data_len, etc. As >>>> mentioned, the transfer length WRITE SAME is typically 512 bytes and >>>> that's the number of bytes that need to be DMA'ed and transferred over >>>> the wire by the controller. But from a command completion perspective we >>>> need to complete however many bytes the command acted upon. Unlike reads >>>> and writes there is not a 1:1 mapping between the transfer length and >>>> the affected area. So we do a bit of magic after the buffer has been >>>> mapped to ensure that the completion byte count matches the number of >>>> blocks that were affected by the command rather than the size of the >>>> data buffer in bytes. >>>> >>>> -- >>>> Martin K. Petersen Oracle Linux Engineering >> -- >> Shaun Tancheff -- Shaun Tancheff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html