James Bottomley wrote: >>>> No, it doesn't. Drain needs to extend the sg table too >>> The patches do extend the sg table. >> Hmm... Where? > > It's the block component; it's already in git head with this patch: > > commit fa0ccd837e3dddb44c7db2f128a8bb7e4eabc21a > Author: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> > Date: Thu Jan 10 11:30:36 2008 -0600 > > block: implement drain buffers > > The description is pretty reasonable. The way it works is by making > block add the drain buffer to the sg table as the last element. > > The way it works is if you simply use the length block gives you, you > find yourself doing DMA to every element in the sg table except the last > one. If you expand the length to anything up to blk_rq_drain_size() you > then use the last element for the drain. Oops, I was talking about padding. Sorry about that. >> Adjusting qc->nbytes isn't enough. libata will have to trim at the end >> of sglist. With both draining and padding, it means libata will have to >> trim one sg entry and a few more bytes. Block layer is behaving >> differently and in a very noticeable way, it's sending inconsistent >> values in data_len and sg list. > > No, that's the point. With all the space available, you don't need to > trim at all ... you simply don't use it. The rationale is the way the > sg element stuff works. If you program a DMA table (PRD table in IDE > speak) into a controller, it doesn't care if you don't use the last few > pieces. With the default length set, the DMA controller won't use the > last programmed element, but it's their if you expand the length or tell > it to for an overrun. Not all controllers have a place to set 'default length'. They just run till sg list terminates. >> Okay, I see your point. The biggest problem is that by doing that blk >> layer breaks a pretty important rule - keeping data len consistent with >> sg list and the bugs it's gonna cause will be very subtle and dangerous. >> This is definitely where we can spend four extra bytes. > > The lengths don't have to be consistent (otherwise we wouldn't need to > calculate them separately) the request reported length just has to be > less than the sg table actual length. We even make use of this property > in SCSI when we can't fit a request into a single command ... we are > allowed to chunk up the request as we complete it. I'm not saying it's > best practice, but it is acceptable behaviour. libata definitely isn't ready for that and I don't think expanding the behavior to whole block layer is a wise thing to do when the best practice can be bought at the cost of 4 bytes per request. No? -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html