On Tue, Aug 12, 2008 at 04:38:48PM -0400, Knight, Frederick wrote:
> I don't see how it doesn't match T13 TRIM command? Both can do single
> ranges. In both cases, you can have 1 LBA and 1 length. There is
> nothing requiring > 1 range to be sent via the SCSI proposal. In both
> cases, you pass the same values to the H/W driver. In one H/W driver it
> will load a bunch of values (including the LBA/length) into a set of
> registers (PATA) of a memory structure (SATA). In the other H/W driver,
> it will load a bunch of values into memory structures (CDB/buffer), and
> then tweek the H/W to send the memory structures.

If you consider a SATL implemented in an array device, it can receive a
PUNCH command with multiple ranges. It must then send multiple TRIM
commands, one for each range.

The proposal is also suboptimal if the common case is just one range.
The SCSI driver has to allocate a 20-byte block and do a DATA OUT
command.

> Most SCSI drivers I've seen that have tagged queuing enabled turn off
> their elevator algorithms (since the drive itself is doing it's own
> optimizations)

In Linux, we try not to have elevators in the device drivers themselves
(though I believe there are still a few which have their own). Instead
we have an elevator in the block layer, where we typically have much
more information about which IOs can be merged, which IOs cannot pass
each other, which OS process submitted the IO (and hence we can do fair
scheduling between different users) and so on. Each request queue
(~= SCSI LUN) can choose which elevator controls its behaviour, so if it
works out better to have the drive do the scheduling, the block-layer
scheduling can be disabled by switching that queue to the noop elevator.

> There is no difference at the filesystem de-allocator level. The only
> difference is how the H/W sends the values to the other end of the wire,
> and there will always be differences at that layer.

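To make the SATL case above concrete, here is a rough sketch (in Python, purely for illustration) of the two directions of translation: fanning a multi-range request out into one single-range command per extent, and wrapping a single extent in a one-entry list for a multi-range interface. All names here (split_multi_range, discard_extent, the send_* callbacks) are hypothetical, and nothing in it models the actual PUNCH or TRIM encodings.

```python
from typing import Callable, Iterable, List, Tuple

# An extent is (starting LBA, length in blocks). Hypothetical type for this sketch.
Extent = Tuple[int, int]


def split_multi_range(ranges: Iterable[Extent],
                      send_single: Callable[[int, int], None]) -> int:
    """Issue one single-range command per extent, as a translation layer
    would have to when the target only accepts one range per command.
    Returns the number of commands issued."""
    sent = 0
    for lba, length in ranges:
        if length <= 0:
            # Nothing to discard for an empty extent; skip it.
            continue
        send_single(lba, length)
        sent += 1
    return sent


def discard_extent(lba: int, length: int,
                   send_multi: Callable[[List[Extent]], None]) -> None:
    """Single-extent entry point layered on a multi-range command: the
    caller passes one (LBA, length) and we build the one-entry list."""
    send_multi([(lba, length)])
```

The second helper is the shape a filesystem-facing interface would take if only the multi-range command existed: the list allocation is pure overhead when there is exactly one extent.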
I think Dave's point is that batching all the discards together into one
list isn't a natural interface for a filesystem; they prefer an interface
which is a single extent.

I'll make a counter-proposal though ... we rename all the commands in
08-149r0 to PUNCH MULTI, ERASE MULTI, etc and add single-(LBA, length)
versions of them. What do you think?

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."