Re: Thin device provisioning

On Tue, Aug 12, 2008 at 04:38:48PM -0400, Knight, Frederick wrote:
> I don't see how it doesn't match T13 TRIM command?  Both can do single
> ranges.  In both cases, you can have 1 LBA and 1 length.  There is
> nothing requiring > 1 range to be sent via the SCSI proposal.  In both
> cases, you pass the same values to the H/W driver.  In one H/W driver it
> will load a bunch of values (including the LBA/length) into a set of
> registers (PATA) or a memory structure (SATA).  In the other H/W driver,
> it will load a bunch of values into memory structures (CDB/buffer), and
> then tweak the H/W to send the memory structures.

If you consider a SATL implemented in an array device, it can receive a
PUNCH command with multiple ranges.  It must then send multiple TRIM
commands, one for each range.
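
To make the fan-out concrete, here is a minimal sketch of what such a SATL would have to do, assuming (as discussed in this thread) that TRIM carries a single (LBA, length) pair.  The names Range and punch_to_trims are hypothetical and are not taken from either proposal.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation of one discard range: (starting LBA, length in blocks).
Range = Tuple[int, int]


def punch_to_trims(ranges: Iterable[Range]) -> List[Range]:
    """Fan one multi-range PUNCH out into one single-range TRIM per range.

    Assumption: each TRIM describes exactly one (LBA, length) pair, so the
    translation layer cannot batch ranges and must issue N commands for N ranges.
    """
    trims: List[Range] = []
    for lba, length in ranges:
        # One outgoing TRIM per incoming range; nothing is combined.
        trims.append((lba, length))
    return trims


if __name__ == "__main__":
    punch = [(0, 8), (64, 16), (4096, 128)]
    for lba, length in punch_to_trims(punch):
        print(f"TRIM lba={lba} length={length}")
```

The number of commands on the ATA side grows linearly with the number of ranges in the PUNCH, which is the cost being pointed at here.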

The proposal is also suboptimal if the common case is just one range.  The SCSI
driver has to allocate a 20-byte block and do a DATA OUT command just to
describe that single range.

> Most SCSI drivers I've seen that have tagged queuing enabled turn off
> their elevator algorithms (since the drive itself is doing its own
> optimizations)

In Linux, we try not to have elevators in the device drivers themselves
(though I believe there are still a few which have their own).  Instead we
have an elevator in the block layer where typically we have much more
information about which IOs can be merged and which IOs cannot pass
each other, which OS process submitted the IO (and hence can do fair
scheduling between different users) and so on.
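
As a rough illustration of the merging part only, the sketch below coalesces requests that are exactly contiguous on disk.  It is a toy model, not the kernel's elevator code: it ignores ordering constraints, request direction and per-process fairness, and the name merge_adjacent is hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical request representation: (starting sector, length in sectors).
Request = Tuple[int, int]


def merge_adjacent(requests: Iterable[Request]) -> List[Request]:
    """Sort requests by start sector and merge those that touch end-to-start."""
    merged: List[Request] = []
    for start, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This request begins exactly where the previous one ends: extend it.
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, prev_len + length)
        else:
            merged.append((start, length))
    return merged


if __name__ == "__main__":
    print(merge_adjacent([(8, 8), (0, 8), (32, 4)]))
```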

Each request queue (~= SCSI LUN) can choose which elevator controls its
behaviour, so if it works out better to have the drive do the scheduling,
block-layer scheduling can effectively be turned off for that queue by
switching it to the noop elevator.
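
The per-queue choice is exposed through sysfs as /sys/block/<dev>/queue/scheduler, where the active elevator is shown in square brackets.  A minimal sketch of reading and changing it follows; the helper names are hypothetical, writing the file needs root, and newer kernels may not offer an elevator called "noop" at all.

```python
def active_elevator(scheduler_line: str) -> str:
    """Return the elevator marked active, e.g. 'cfq' from 'noop deadline [cfq]'."""
    for name in scheduler_line.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    raise ValueError("no active elevator marked in: %r" % scheduler_line)


def set_elevator(device: str, name: str = "noop", sysfs_root: str = "/sys/block") -> None:
    """Select an elevator for one request queue (assumes root and that `name` is offered)."""
    path = "%s/%s/queue/scheduler" % (sysfs_root, device)
    with open(path, "w") as f:
        f.write(name)


if __name__ == "__main__":
    # Parsing only; this does not touch any real device.
    print(active_elevator("noop anticipatory deadline [cfq]"))
```

From a shell, the equivalent is to echo the elevator name into the same file for the device in question.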

> There is no difference at the filesystem de-allocator level.  The only
> difference is how the H/W sends the values to the other end of the wire,
> and there will always be differences at that layer. 

I think Dave's point is that batching all the discards together into one
list isn't a natural interface for a filesystem; filesystems prefer an
interface which takes a single extent.


I'll make a counter-proposal though ... we rename all the commands in
08-149r0 to PUNCH MULTI, ERASE MULTI, etc and add single-(LBA, length)
versions of them.  What do you think?

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."