FUJITA Tomonori wrote: > From: Jens Axboe <jens.axboe@xxxxxxxxxx> > Subject: Re: [PATCH 4/4] bidi support: bidirectional request > Date: Mon, 30 Apr 2007 13:11:57 +0200 > >> On Sun, Apr 29 2007, James Bottomley wrote: >>> On Sun, 2007-04-29 at 18:48 +0300, Boaz Harrosh wrote: >>>> FUJITA Tomonori wrote: >>>>> From: Boaz Harrosh <bharrosh@xxxxxxxxxxx> >>>>> Subject: [PATCH 4/4] bidi support: bidirectional request >>>>> Date: Sun, 15 Apr 2007 20:33:28 +0300 >>>>> >>>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h >>>>>> index 645d24b..16a02ee 100644 >>>>>> --- a/include/linux/blkdev.h >>>>>> +++ b/include/linux/blkdev.h >>>>>> @@ -322,6 +322,7 @@ struct request { >>>>>> void *end_io_data; >>>>>> >>>>>> struct request_io_part uni; >>>>>> + struct request_io_part bidi_read; >>>>>> }; >>>>> Would be more straightforward to have: >>>>> >>>>> struct request_io_part in; >>>>> struct request_io_part out; >>>>> >>>> Yes I wish I could do that. For bidi supporting drivers this is the most logical. >>>> But for the 99.9% of uni-directional drivers, calling rq_uni(), and being some what on >>>> the hotish paths, this means we will need a pointer to a uni request_io_part. >>>> This is bad because: >>>> 1st- There is no defined stage in a request life where to definitely set that pointer, >>>> specially in the preparation stages. >>>> 2nd- hacks like scsi_error.c/scsi_send_eh_cmnd() will not work at all. Now this is a >>>> very bad spot already, and I have a short term fix for it in the SCSI-bidi patches >>>> (not sent yet) but a more long term solution is needed. Once such hacks are >>>> cleaned up we can do what you say. This is exactly why I use the access functions >>>> rq_uni/rq_io/rq_in/rq_out and not open code access. >>> I'm still not really convinced about this approach. The primary job of >>> the block layer is to manage and merge READ and WRITE requests. It >>> serves a beautiful secondary function of queueing for arbitrary requests >>> it doesn't understand (REQ_TYPE_BLOCK_PC or REQ_TYPE_SPECIAL ... or >>> indeed any non REQ_TYPE_FS). >>> >>> bidirectional requests fall into the latter category (there's nothing >>> really we can do to merge them ... they're just transported by the block >>> layer). The only unusual feature is that they carry two bios. I think >>> the drivers that actually support bidirectional will be a rarity, so it >>> might even be advisable to add it to the queue capability (refuse >>> bidirectional requests at the top rather than perturbing all the drivers >>> to process them). >>> >>> So, what about REQ_TYPE_BIDIRECTIONAL rather than REQ_BIDI? That will >>> remove it from the standard path and put it on the special command type >>> path where we can process it specially. Additionally, if you take this >>> approach, you can probably simply chain the second bio through >>> req->special as an additional request in the stream. The only thing >>> that would then need modification would be the dequeue of the block >>> driver (it would have to dequeue both requests and prepare them) and >>> that needs to be done only for drivers handling bidirectional requests. >> I agree, I'm really not crazy about shuffling the entire request setup >> around just for something as exotic as bidirection commands. How about >> just keeping it simple - have a second request linked off the first one >> for the second data phase? So keep it completely seperate, not just >> overload ->special for 2nd bio list. >> >> So basically just add a struct request pointer, so you can do rq = >> rq->next_rq or something for the next data phase. I bet this would be a >> LOT less invasive as well, and we can get by with a few helpers to >> support it. > > This patch tried this approach. It's just for seeing how it works. > > I added bidi support to open-iscsi and bsg and tested this patch > lightly. I've attached only a patch for the block layer and scsl-ml. > You can get all the patches are: > > http://www.kernel.org/pub/linux/kernel/people/tomo/bidi > > If we go with this approach, we need just minor changes to the block > layer. The overloading rq->special approach needs more but it's > reasonable too. > > I need to add the proper error handling code, which might be a bit > tricky, but I think that it will not be so complicated. > > >> And it should definitely be a request type. > > I'm not sure about this. I think that bidi can't be a request type to > trace bidi pc requests (we have bidi special requests like SMP). I use > REQ_BIDI though I've not implemented bidi trace code. > > > From 7d278323ff8aad86fb82c823538f7ddfb6ded11c Mon Sep 17 00:00:00 2001 > From: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> > Date: Wed, 2 May 2007 03:55:56 +0900 > Subject: [PATCH] add bidi support > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> > --- > block/ll_rw_blk.c | 1 + > drivers/scsi/scsi_lib.c | 72 +++++++++++++++++++++++++++++++++++++++------- > include/linux/blkdev.h | 7 ++++ > include/scsi/scsi_cmnd.h | 9 ++++++ > 4 files changed, 78 insertions(+), 11 deletions(-) > Well what can I say guys. Just that I told you so. After 6 month you all agree on my original solution. Back than I wouldn't even dream of touching block layer and struct request. The 2 requests approach seemed perfectly fine. But people laughed that it is maybe good for out-of-tree debugging but I must do a proper support at the request layer. Even then, I say "ok but can we do it very ugly but safe on the side?, not to touch any legacy code." "No!" people say "do one for in one for out". So what should I understand for next time? a "Proper bidi support at the block layer" means we'll let you have a bad named pointer so you don't have to overload req->end_io_data at the scsi-ml level. Thanks I could use req->data for that. Tomo's code is a perfectly valid approach, but it is certainly not finished. I have some comments and questions below. If anyone cares I put up a patcheset that I've tested to be stable. it is here: http://www.bhalevy.com/open-osd/download/double_request_bidi/ 1. I'm not very experienced with blk_map_user but there is some asymmetry at SCSI-ml between handling of the first WRITE request and second READ request. At scsi_end_request() there is the end_that_request_chunk() calls. Now from my experience with kernel buffers (I am using sglists through scsi_req_map_sg()) not calling end_that_request_chunk() on the second request will leak the BIOs. Maybe blk_{,un}map_user() does not care I'm not sure. In my SCSI-ml patch I have hacked up a scsi_end_request_block() to free the BIOs unconditionally, since it is what I want to do. But calling end_that_request_chunk() with the right size should also do the trick. Also at scsi_release_buffers() you better NULL the pointers and check for not NULL as scsi_release_buffers() can be called multiple times now, and on half prepared commands, in the error handling paths. (See my patch: 0002-scsi-midlayer-bidirectional-commands.patch) 2. At struct request the blk_bidi_rq() should just be: #define blk_bidi_rq(rq) ((rq)->next_rq != NULL) The flag gives us nothing but an extra BUG_ON. In fact we don't need the flag at all. And the name next_rq? are we intending to have a chain of these? What's bad with plain bidi_read? 3. On the other hand we need a new flag REQ_IO_ONLY, to stick on the Second READ request. So block layer can avoid doing any extra work on IO_ONLY requests. For now it does not matter since BIDI devices like OSD do not get any regular FS traffic. But when they will, the Second READ request might confuse the elevator handling. (See: 0003-out-of-tree-block-bidi-debugging.patch for what I mean) Right now I'm not sure how to proceed. You can see my current patchset. if any of it you like it's there. If any of it you don't like I can change them. Any questions I will answer But I think I'll go crawl under a rock for a while before I make a fool of myself... oops is that to late?;) Boaz - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html