On Sun, May 03, 2009 at 02:34:35PM -0400, Jeff Garzik wrote:
> Yes, the task of figuring out -what to do- in the queue's request
> function is quite complex, and discard makes it even more so.
>
> The API makes life difficult -- you have to pass temporary info to
> yourself in ->prepare_flush_fn() and ->prepare_discard_fn(), and the
> overall sum is a bewildering collection of opcodes, flags, and internal
> driver notes to itself.
>
> Add to this yet another prep function, ->prep_rq_fn()
>
> It definitely sucks, especially with regards to failed atomic
> allocations... but I think fixing this is quite a bit more than Matthew
> is probably willing to tackle ;-)

I'm completely confused by the block layer API, to be honest.  Trying to
deduce how to add a new feature at this stage is hard (compare it to
adding the reflink operation to the VFS ...).  I'm definitely willing to
tackle changing the block device API, but it may take a while.

> My ideal block layer interface would be a lot more opcode-based, e.g.
>
> (1) create REQ_TYPE_DISCARD
>
> (2) determine at init if queue (a) supports explicit DISCARD and/or (b)
> supports DISCARD flag passed with READ or WRITE
>
> (3) when creating a discard request, use block helpers w/ queue-specific
> knowledge to create either
>     (a) one request, REQ_TYPE_FS, with discard flag or
>     (b) two requests, REQ_TYPE_FS followed by REQ_TYPE_DISCARD

I'm not sure we need option 3b.

> (4) blkdev_issue_discard() would function like an empty barrier, and
> unconditionally create REQ_TYPE_DISCARD.

I can certainly prototype a replacement for discard_prep_fn along these
lines.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
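
For illustration only, the opcode-based scheme in points (1) to (4) of the quoted proposal might look roughly like the userspace model below. This is a sketch, not kernel code: struct queue_caps, struct request, build_write_with_discard() and build_standalone_discard() are hypothetical names invented here, and preferring the flagged form when a queue supports both is an assumption. Only REQ_TYPE_FS and REQ_TYPE_DISCARD are taken from the proposal. Option 3b is kept as the fallback for queues that cannot carry the flag, which is the case questioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* (1) A dedicated opcode for discard, alongside normal filesystem I/O. */
enum req_type { REQ_TYPE_FS, REQ_TYPE_DISCARD };

/* Hypothetical stand-in for a block request. */
struct request {
    enum req_type type;
    bool discard_flag;      /* discard piggybacked on a READ or WRITE */
};

/* (2) Capabilities determined once, at queue init time. */
struct queue_caps {
    bool explicit_discard;  /* (2a) queue accepts a standalone DISCARD */
    bool flagged_discard;   /* (2b) queue accepts a DISCARD flag on READ/WRITE */
};

/*
 * (3) Helper with queue-specific knowledge. Fills out[] and returns the
 * number of requests created:
 *   1 -> (3a) one REQ_TYPE_FS request carrying the discard flag, or a
 *        plain REQ_TYPE_FS request if the queue has no discard support
 *   2 -> (3b) REQ_TYPE_FS followed by REQ_TYPE_DISCARD
 */
size_t build_write_with_discard(const struct queue_caps *caps,
                                struct request out[2])
{
    assert(caps != NULL && out != NULL);

    out[0].type = REQ_TYPE_FS;
    out[0].discard_flag = false;

    if (caps->flagged_discard) {
        /* Assumption: prefer the single flagged request when possible. */
        out[0].discard_flag = true;
        return 1;
    }
    if (caps->explicit_discard) {
        out[1].type = REQ_TYPE_DISCARD;
        out[1].discard_flag = false;
        return 2;
    }
    return 1;   /* no discard support: the write goes out unchanged */
}

/* (4) A standalone discard always uses the dedicated opcode. */
struct request build_standalone_discard(void)
{
    struct request rq = { REQ_TYPE_DISCARD, false };
    return rq;
}
```

With this shape, the driver's request function would switch on the request type rather than decode notes left by a prep function, which is the simplification the proposal is after.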