From: Jens Axboe <jens.axboe@xxxxxxxxxx> Subject: Re: [PATCH 4/4] bidi support: bidirectional request Date: Mon, 30 Apr 2007 13:11:57 +0200 > On Sun, Apr 29 2007, James Bottomley wrote: > > On Sun, 2007-04-29 at 18:48 +0300, Boaz Harrosh wrote: > > > FUJITA Tomonori wrote: > > > > From: Boaz Harrosh <bharrosh@xxxxxxxxxxx> > > > > Subject: [PATCH 4/4] bidi support: bidirectional request > > > > Date: Sun, 15 Apr 2007 20:33:28 +0300 > > > > > > > >> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > > > >> index 645d24b..16a02ee 100644 > > > >> --- a/include/linux/blkdev.h > > > >> +++ b/include/linux/blkdev.h > > > >> @@ -322,6 +322,7 @@ struct request { > > > >> void *end_io_data; > > > >> > > > >> struct request_io_part uni; > > > >> + struct request_io_part bidi_read; > > > >> }; > > > > > > > > Would be more straightforward to have: > > > > > > > > struct request_io_part in; > > > > struct request_io_part out; > > > > > > > > > > Yes I wish I could do that. For bidi supporting drivers this is the most logical. > > > But for the 99.9% of uni-directional drivers, calling rq_uni(), and being some what on > > > the hotish paths, this means we will need a pointer to a uni request_io_part. > > > This is bad because: > > > 1st- There is no defined stage in a request life where to definitely set that pointer, > > > specially in the preparation stages. > > > 2nd- hacks like scsi_error.c/scsi_send_eh_cmnd() will not work at all. Now this is a > > > very bad spot already, and I have a short term fix for it in the SCSI-bidi patches > > > (not sent yet) but a more long term solution is needed. Once such hacks are > > > cleaned up we can do what you say. This is exactly why I use the access functions > > > rq_uni/rq_io/rq_in/rq_out and not open code access. > > > > I'm still not really convinced about this approach. The primary job of > > the block layer is to manage and merge READ and WRITE requests. It > > serves a beautiful secondary function of queueing for arbitrary requests > > it doesn't understand (REQ_TYPE_BLOCK_PC or REQ_TYPE_SPECIAL ... or > > indeed any non REQ_TYPE_FS). > > > > bidirectional requests fall into the latter category (there's nothing > > really we can do to merge them ... they're just transported by the block > > layer). The only unusual feature is that they carry two bios. I think > > the drivers that actually support bidirectional will be a rarity, so it > > might even be advisable to add it to the queue capability (refuse > > bidirectional requests at the top rather than perturbing all the drivers > > to process them). > > > > So, what about REQ_TYPE_BIDIRECTIONAL rather than REQ_BIDI? That will > > remove it from the standard path and put it on the special command type > > path where we can process it specially. Additionally, if you take this > > approach, you can probably simply chain the second bio through > > req->special as an additional request in the stream. The only thing > > that would then need modification would be the dequeue of the block > > driver (it would have to dequeue both requests and prepare them) and > > that needs to be done only for drivers handling bidirectional requests. > > I agree, I'm really not crazy about shuffling the entire request setup > around just for something as exotic as bidirection commands. How about > just keeping it simple - have a second request linked off the first one > for the second data phase? So keep it completely seperate, not just > overload ->special for 2nd bio list. > > So basically just add a struct request pointer, so you can do rq = > rq->next_rq or something for the next data phase. I bet this would be a > LOT less invasive as well, and we can get by with a few helpers to > support it. This patch tried this approach. It's just for seeing how it works. I added bidi support to open-iscsi and bsg and tested this patch lightly. I've attached only a patch for the block layer and scsl-ml. You can get all the patches are: http://www.kernel.org/pub/linux/kernel/people/tomo/bidi If we go with this approach, we need just minor changes to the block layer. The overloading rq->special approach needs more but it's reasonable too. I need to add the proper error handling code, which might be a bit tricky, but I think that it will not be so complicated. > And it should definitely be a request type. I'm not sure about this. I think that bidi can't be a request type to trace bidi pc requests (we have bidi special requests like SMP). I use REQ_BIDI though I've not implemented bidi trace code. >From 7d278323ff8aad86fb82c823538f7ddfb6ded11c Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> Date: Wed, 2 May 2007 03:55:56 +0900 Subject: [PATCH] add bidi support Signed-off-by: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> --- block/ll_rw_blk.c | 1 + drivers/scsi/scsi_lib.c | 72 +++++++++++++++++++++++++++++++++++++++------- include/linux/blkdev.h | 7 ++++ include/scsi/scsi_cmnd.h | 9 ++++++ 4 files changed, 78 insertions(+), 11 deletions(-) diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c index cf8752a..82842d6 100644 --- a/block/ll_rw_blk.c +++ b/block/ll_rw_blk.c @@ -256,6 +256,7 @@ static void rq_init(request_queue_t *q, rq->end_io = NULL; rq->end_io_data = NULL; rq->completion_data = NULL; + rq->next_rq = NULL; } /** diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index be8e655..96541cb 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -701,34 +701,36 @@ static struct scsi_cmnd *scsi_end_reques return NULL; } -struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask) +static struct scatterlist *do_scsi_alloc_sgtable(unsigned short use_sg, + unsigned short *sglist_len, + gfp_t gfp_mask) { struct scsi_host_sg_pool *sgp; struct scatterlist *sgl; - BUG_ON(!cmd->use_sg); + BUG_ON(!use_sg); - switch (cmd->use_sg) { + switch (use_sg) { case 1 ... 8: - cmd->sglist_len = 0; + *sglist_len = 0; break; case 9 ... 16: - cmd->sglist_len = 1; + *sglist_len = 1; break; case 17 ... 32: - cmd->sglist_len = 2; + *sglist_len = 2; break; #if (SCSI_MAX_PHYS_SEGMENTS > 32) case 33 ... 64: - cmd->sglist_len = 3; + *sglist_len = 3; break; #if (SCSI_MAX_PHYS_SEGMENTS > 64) case 65 ... 128: - cmd->sglist_len = 4; + *sglist_len = 4; break; #if (SCSI_MAX_PHYS_SEGMENTS > 128) case 129 ... 256: - cmd->sglist_len = 5; + *sglist_len = 5; break; #endif #endif @@ -737,11 +739,15 @@ #endif return NULL; } - sgp = scsi_sg_pools + cmd->sglist_len; + sgp = scsi_sg_pools + *sglist_len; sgl = mempool_alloc(sgp->pool, gfp_mask); return sgl; } +struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask) +{ + return do_scsi_alloc_sgtable(cmd->use_sg, &cmd->sglist_len, gfp_mask); +} EXPORT_SYMBOL(scsi_alloc_sgtable); void scsi_free_sgtable(struct scatterlist *sgl, int index) @@ -778,6 +784,9 @@ static void scsi_release_buffers(struct if (cmd->use_sg) scsi_free_sgtable(cmd->request_buffer, cmd->sglist_len); + if (cmd->ext_request_buffer.use_sg) + scsi_free_sgtable(cmd->ext_request_buffer.request_buffer, + cmd->ext_request_buffer.sglist_len); /* * Zero these out. They now point to freed memory, and it is * dangerous to hang onto the pointers. @@ -1106,9 +1115,48 @@ static int scsi_setup_blk_pc_cmnd(struct BUG_ON(!req->nr_phys_segments); + if (blk_bidi_rq(req)) { + BUG_ON(!req->next_rq); + + if (rq_data_dir(req) != WRITE || + rq_data_dir(req->next_rq) != READ) { + scsi_release_buffers(cmd); + scsi_put_command(cmd); + return BLKPREP_KILL; + } + } + ret = scsi_init_io(cmd); if (unlikely(ret)) return ret; + + if (blk_bidi_rq(req)) { + struct scsi_data_buffer *sdb = &cmd->ext_request_buffer; + struct scatterlist *sgpnt; + int count; + + sdb->use_sg = req->nr_phys_segments; + + sgpnt = do_scsi_alloc_sgtable(sdb->use_sg, + &sdb->sglist_len, + GFP_ATOMIC); + if (unlikely(!sgpnt)) { + scsi_release_buffers(cmd); + scsi_unprep_request(req); + return BLKPREP_DEFER; + } + + sdb->request_buffer = sgpnt; + sdb->request_bufflen = req->next_rq->data_len; + count = blk_rq_map_sg(req->q, req->next_rq, sgpnt); + if (unlikely(count > sdb->use_sg)) { + scsi_free_sgtable(sgpnt, sdb->sglist_len); + scsi_release_buffers(cmd); + scsi_put_command(cmd); + return BLKPREP_KILL; + } + sdb->use_sg = count; + } } else { BUG_ON(req->data_len); BUG_ON(req->data); @@ -1122,7 +1170,9 @@ static int scsi_setup_blk_pc_cmnd(struct BUILD_BUG_ON(sizeof(req->cmd) > sizeof(cmd->cmnd)); memcpy(cmd->cmnd, req->cmd, sizeof(cmd->cmnd)); cmd->cmd_len = req->cmd_len; - if (!req->data_len) + if (blk_bidi_rq(req)) + cmd->sc_data_direction = DMA_BIDIRECTIONAL; + else if (!req->data_len) cmd->sc_data_direction = DMA_NONE; else if (rq_data_dir(req) == WRITE) cmd->sc_data_direction = DMA_TO_DEVICE; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 83dcd8c..9d3bb4a 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -199,6 +199,7 @@ enum rq_flag_bits { __REQ_ALLOCED, /* request came from our alloc pool */ __REQ_RW_META, /* metadata io request */ __REQ_NR_BITS, /* stops here */ + __REQ_BIDI, /* bidirectional io request */ }; #define REQ_RW (1 << __REQ_RW) @@ -219,6 +220,7 @@ #define REQ_ORDERED_COLOR (1 << __REQ_OR #define REQ_RW_SYNC (1 << __REQ_RW_SYNC) #define REQ_ALLOCED (1 << __REQ_ALLOCED) #define REQ_RW_META (1 << __REQ_RW_META) +#define REQ_BIDI (1 << __REQ_BIDI) #define BLK_MAX_CDB 16 @@ -313,6 +315,9 @@ struct request { */ rq_end_io_fn *end_io; void *end_io_data; + + /* for bidi */ + struct request *next_rq; }; /* @@ -478,6 +483,7 @@ #define QUEUE_FLAG_DEAD 5 /* queue bein #define QUEUE_FLAG_REENTER 6 /* Re-entrancy avoidance */ #define QUEUE_FLAG_PLUGGED 7 /* queue is plugged */ #define QUEUE_FLAG_ELVSWITCH 8 /* don't use elevator, just do FIFO */ +#define QUEUE_FLAG_BIDI 9 /* bidirectional requests */ enum { /* @@ -542,6 +548,7 @@ #define blk_pm_request(rq) \ #define blk_sorted_rq(rq) ((rq)->cmd_flags & REQ_SORTED) #define blk_barrier_rq(rq) ((rq)->cmd_flags & REQ_HARDBARRIER) #define blk_fua_rq(rq) ((rq)->cmd_flags & REQ_FUA) +#define blk_bidi_rq(rq) ((rq)->cmd_flags & REQ_BIDI) #define list_entry_rq(ptr) list_entry((ptr), struct request, queuelist) diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h index a2e0c10..0e259e5 100644 --- a/include/scsi/scsi_cmnd.h +++ b/include/scsi/scsi_cmnd.h @@ -11,6 +11,12 @@ struct scatterlist; struct Scsi_Host; struct scsi_device; +struct scsi_data_buffer { + unsigned short use_sg; /* Number of pieces of scatter-gather */ + unsigned short sglist_len; /* size of malloc'd scatter-gather list */ + void *request_buffer; /* Actual requested buffer */ + unsigned request_bufflen; /* Actual request size */ +}; /* embedded in scsi_cmnd */ struct scsi_pointer { @@ -117,6 +123,9 @@ #define SCSI_SENSE_BUFFERSIZE 96 unsigned char tag; /* SCSI-II queued command tag */ unsigned long pid; /* Process ID, starts at 0. Unique per host. */ + + /* bidi in buffer */ + struct scsi_data_buffer ext_request_buffer; }; extern struct scsi_cmnd *scsi_get_command(struct scsi_device *, gfp_t); -- 1.4.3.2 - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html