FUJITA Tomonori wrote: > Here is an updated version of the patch to add bidi support to block > pc requests. Bugs spotted by Benny were fixed. > > This patch can be applied cleanly to the scsi-misc git tree and is on > the top of the following patch to add linked request support: > > http://marc.info/?l=linux-scsi&m=117835587615642&w=2 > > --- > From: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> > Date: Mon, 7 May 2007 16:42:24 +0900 > Subject: [PATCH] add bidi support to scsi pc commands > > This adds bidi support to block pc requests. > > A bidi request uses req->next_rq pointer for an in request. > > This patch introduces a new structure, scsi_data_buffer to hold the > data buffer information. To avoid make scsi_cmnd structure fatter, the > scsi mid-layer uses cmnd->request->next_rq->special pointer for > a scsi_data_buffer structure. LLDs don't touch the second request > (req->next_rq) so it's safe to use req->special. > > scsi_blk_pc_done() always completes the whole command so > scsi_end_request() simply completes the bidi chunk too. > > A helper function, scsi_bidi_data_buffer() is for LLDs to access to > the scsi_data_buffer structure easily. > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> > --- > @@ -685,6 +696,14 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate, > } > } > > + /* > + * a REQ_BLOCK_PC command is always completed fully so just do > + * end the bidi chunk. > + */ > + if (sdb) > + end_that_request_chunk(req->next_rq, uptodate, > + sdb->request_bufflen); > + sdb->request_bufflen was zeroed in scsi_release_buffers which is called before scsi_end_request. > static void scsi_blk_pc_done(struct scsi_cmnd *cmd) > { > BUG_ON(!blk_pc_request(cmd->request)); > @@ -1090,10 +1159,22 @@ static void scsi_blk_pc_done(struct scsi_cmnd *cmd) > static int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req) > { > struct scsi_cmnd *cmd; > + struct scsi_data_buffer *sdb = NULL; > + > + if (blk_bidi_rq(req)) { > + sdb = kzalloc(sizeof(*sdb), GFP_ATOMIC); > + if (unlikely(!sdb)) > + return BLKPREP_DEFER; > + } > > cmd = scsi_get_cmd_from_req(sdev, req); > - if (unlikely(!cmd)) > + if (unlikely(!cmd)) { > + kfree(sdb); > return BLKPREP_DEFER; > + } > + > + if (sdb) > + req->next_rq->special = sdb; > > /* > * BLOCK_PC requests may transfer data, in which case they must Before I get to my main concern here I have one comment. in scsi_get_cmd_from_req() there is a code path in which a scsi_cmnd is taken from special and is not newly allocated. It is best to move bidi allocation to scsi_get_cmd_from_req() and allocate new sdb only if we get a new Command. (See my attached patch for an example) OK! My main concern is with kzalloc(sizeof(*sdb), GFP_ATOMIC); All IO allocations should come from slabs in case of a memory starved system. There are 3 possible solutions I see: 1. Use struct scsi_cmnd for bidi_read and not a special scsi_data_buffer. Attached is above solution. It is by far the simplest of the three. So simple that even Jens's BigIO patch can almost apply at scsi_lib.c. (and vise versa) What's hanged on the request->next_rq->special is a second scsi_cmnd. The command is taken from regular command pool. This solution, though wasteful of some memory is very stable. 2. Force the users of BIDI to allocate scsi_data_buffer(s) Users will allocate a slab for scsi_data_buffers and hang them on nex_rq->special before hand. Than free them together with second request. This approach can be very stable, But it is bad because it is a layering violation. When block and SCSI layers decide to change the wiring of bidi. Users will have to change with them. 3. Let SCSI-ml manage a slab for scsi_data_buff's Have a flag to scsi_setup_command_freelist() or a new API. When a device is flagged bidi-able we could 1-Allocate bigger slots to have extra memory for SDBs together with the command itself. or 2-allocate yet another slab for SDBs. The 3rd approach is clean, and I can submit a patch for it. But I think it is not currently necessary. The first solution (See attached patch) we can all live with, I hope. Since for me it gives me the stability I need. And for the rest of the kernel it is the minimum change, layered on existing building blocks. If any one wants to try it out I put up the usual patchset that exercise this approach here. http://www.bhalevy.com/open-osd/download/double_rq_bidi Thanks Boaz
>From ac8f0d3df711c5d01a269fde6a4ecce3000c3f68 Mon Sep 17 00:00:00 2001 From: Boaz Harrosh <bharrosh@xxxxxxxxxxx> Date: Tue, 8 May 2007 14:04:46 +0300 Subject: [PATCH] scsi-ml bidi support At the block level bidi request uses req->next_rq pointer for a second bidi_read request. At Scsi-midlayer a second scsi_cmnd structure is used for the sglists of the bidi_read part. This command is put on request->next_rq->special. So struct scsi_cmnd is not changed. And scsi-ml minimally changes. - Define scsi_end_bidi_request() to do what scsi_end_request() does but for a bidi request. This is necessary because bidi commands are a bit tricky here. (See comments in body) - scsi_release_buffers() will also release the bidi_read buffers. I have changed the check from "if (cmd->use_sg)" to "if(cmd->request_buffer)" because it can now be called on half prepared commands. (and use_sg is no longer relevant here) - scsi_io_completion() will now call scsi_end_bidi_request on bidi commands and return. - The previous work done in scsi_init_io() is now done in a new scsi_init_data_buf() (which is 95% identical to old scsi_init_io()) The new scsi_init_io() will call the above twice if needed also for the bidi_read command. - scsi_get_cmd_from_req() (called from scsi_setup_blk_pc_cmnd()) in the case that a new command is allocated, will also allocate a bidi_read command and hang it on cmd->request->next_rq->special. Only at this point is a command bidi. - scsi_setup_blk_pc_cmnd() will update sc_dma_direction to DMA_BIDIRECTIONAL which is not correct. It is only here for compatibility with iSCSI bidi driver. At final patch this will be a DMA_TO_DEVICE for this command and DMA_FROM_DEVICE for the bidi_read command. A driver must call scsi_bidi_cmnd() to work on bidi commands. - Define scsi_bidi_cmnd() to return true if it is a bidi command and a second command was allocated. - Define scsi_in()/scsi_out() to return the in or out commands from this command This API is to isolate users from the mechanics of bidi, and is also a short hand to common used code. (Even in scsi_lib.c) - In scsi_error.c at scsi_send_eh_cmnd() make sure bidi-lld is not confused by a get-sense command that looks like bidi. This is done by puting NULL at request->next_rq, and restoring before exit. --- drivers/scsi/scsi_error.c | 5 ++ drivers/scsi/scsi_lib.c | 133 ++++++++++++++++++++++++++++++++++++++++++--- include/scsi/scsi_cmnd.h | 19 ++++++- 3 files changed, 148 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index e8350c5..169f4b4 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -620,6 +620,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, unsigned char old_cmd_len; unsigned old_bufflen; void *old_buffer; + struct request* old_next_rq; int rtn; /* @@ -636,6 +637,9 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, old_cmd_len = scmd->cmd_len; old_use_sg = scmd->use_sg; + old_next_rq = scmd->request->next_rq; + scmd->request->next_rq = NULL; + memset(scmd->cmnd, 0, sizeof(scmd->cmnd)); memcpy(scmd->cmnd, cmnd, cmnd_size); @@ -734,6 +738,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, /* * Restore original data */ + scmd->request->next_rq = old_next_rq; scmd->request_buffer = old_buffer; scmd->request_bufflen = old_bufflen; memcpy(scmd->cmnd, old_cmnd, sizeof(scmd->cmnd)); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 61fbcdc..0f49195 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -85,6 +85,10 @@ static void scsi_unprep_request(struct request *req) req->cmd_flags &= ~REQ_DONTPREP; req->special = NULL; + if (scsi_bidi_cmnd(cmd)) { + scsi_put_command(scsi_in(cmd)); + cmd->request->next_rq->special = NULL; + } scsi_put_command(cmd); } @@ -701,6 +705,52 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate, return NULL; } +/* + * Bidi commands Must be complete as a whole, both sides at once. + * If part of the bytes were written and lld returned + * scsi_in()->resid or scsi_out()->resid this information will be left + * in req->data_len and req->next_rq->data_len. The upper-layer driver can + * decide what to do with this information. + */ +void scsi_end_bidi_request(struct scsi_cmnd *cmd) +{ + struct scsi_cmnd* bidi_in = scsi_in(cmd); + request_queue_t *q = cmd->device->request_queue; + struct request *req = cmd->request; + unsigned long flags; + + end_that_request_chunk(req, 1, req->data_len); + req->data_len = cmd->resid; + end_that_request_chunk(req->next_rq, 1, req->next_rq->data_len); + req->next_rq->data_len = bidi_in->resid; + + req->next_rq->special = NULL; + scsi_put_command(bidi_in); + + /* + *FIXME: If ll_rw_blk.c is changed to also put_request(req->next_rq) + * in end_that_request_last() then this WARN_ON must be removed. + * Otherwise, upper-driver must have registered an end_io. + */ + WARN_ON(!req->end_io); + + /* FIXME: the following code can be shared with scsi_end_request */ + add_disk_randomness(req->rq_disk); + + spin_lock_irqsave(q->queue_lock, flags); + if (blk_rq_tagged(req)) + blk_queue_end_tag(q, req); + + end_that_request_last(req, 1); /*allways good for now*/ + spin_unlock_irqrestore(q->queue_lock, flags); + + /* + * This will goose the queue request function at the end, so we don't + * need to worry about launching another command. + */ + scsi_next_command(cmd); +} + struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask) { struct scsi_host_sg_pool *sgp; @@ -775,7 +825,7 @@ EXPORT_SYMBOL(scsi_free_sgtable); */ static void scsi_release_buffers(struct scsi_cmnd *cmd) { - if (cmd->use_sg) + if (cmd->request_buffer) scsi_free_sgtable(cmd->request_buffer, cmd->sglist_len); /* @@ -784,6 +834,15 @@ static void scsi_release_buffers(struct scsi_cmnd *cmd) */ cmd->request_buffer = NULL; cmd->request_bufflen = 0; + + if (scsi_bidi_cmnd(cmd)) { + struct scsi_cmnd* bidi_in = scsi_in(cmd); + if (bidi_in->request_buffer) + scsi_free_sgtable(bidi_in->request_buffer, + bidi_in->sglist_len); + bidi_in->request_buffer = NULL; + bidi_in->request_bufflen = 0; + } } /* @@ -849,9 +908,15 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) req->sense_len = len; } } + if (scsi_bidi_cmnd(cmd)) { + scsi_end_bidi_request(cmd); + return; + } req->data_len = cmd->resid; } + BUG_ON(blk_bidi_rq(req)); + /* * Next deal with any sectors which we were able to correctly * handle. @@ -978,17 +1043,18 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) EXPORT_SYMBOL(scsi_io_completion); /* - * Function: scsi_init_io() + * Function: scsi_init_data_buf() * - * Purpose: SCSI I/O initialize function. + * Purpose: SCSI I/O initialize helper. + * maps the request buffers for the given cmd. * - * Arguments: cmd - Command descriptor we wish to initialize + * Arguments: cmd - Command whos data we wish to initialize * * Returns: 0 on success * BLKPREP_DEFER if the failure is retryable * BLKPREP_KILL if the failure is fatal */ -static int scsi_init_io(struct scsi_cmnd *cmd) +static int scsi_init_data_buff(struct scsi_cmnd *cmd) { struct request *req = cmd->request; struct scatterlist *sgpnt; @@ -1032,12 +1098,45 @@ static int scsi_init_io(struct scsi_cmnd *cmd) printk(KERN_ERR "req nr_sec %lu, cur_nr_sec %u\n", req->nr_sectors, req->current_nr_sectors); - /* release the command and kill it */ - scsi_release_buffers(cmd); - scsi_put_command(cmd); return BLKPREP_KILL; } +/* + * Function: scsi_init_io() + * + * Purpose: SCSI I/O initialize function. + * + * Arguments: cmd - Command descriptor we wish to initialize + * + * Returns: 0 on success + * BLKPREP_DEFER if the failure is retryable + * BLKPREP_KILL if the failure is fatal + */ +static int scsi_init_io(struct scsi_cmnd *cmd) +{ + struct request *req = cmd->request; + int error; + + error = scsi_init_data_buff(cmd); + if (error) + goto err_exit; + + if (scsi_bidi_cmnd(cmd)) { + error = scsi_init_data_buff(scsi_in(cmd)); + if (error) + goto err_exit; + } + + req->buffer = NULL; + return 0; + +err_exit: + scsi_release_buffers(cmd); + if (error == BLKPREP_KILL) + scsi_unprep_request(req); + return error; +} + static int scsi_issue_flush_fn(request_queue_t *q, struct gendisk *disk, sector_t *error_sector) { @@ -1064,6 +1163,22 @@ static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev, if (unlikely(!cmd)) return NULL; req->special = cmd; + if (blk_bidi_rq(req)) { + struct scsi_cmnd *bidi_in_cmd; + /* + * second bidi request must be blk_pc_request for use of + * correct sizes. + */ + WARN_ON(!blk_pc_request(req->next_rq)); + + bidi_in_cmd = scsi_get_command(sdev, GFP_ATOMIC); + if (unlikely(!bidi_in_cmd)) { + scsi_put_command(cmd); + return NULL; + } + req->next_rq->special = bidi_in_cmd; + bidi_in_cmd->request = req->next_rq; + } } else { cmd = req->special; } @@ -1124,6 +1239,8 @@ static int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req) cmd->cmd_len = req->cmd_len; if (!req->data_len) cmd->sc_data_direction = DMA_NONE; + else if (blk_bidi_rq(req)) + cmd->sc_data_direction = DMA_BIDIRECTIONAL; else if (rq_data_dir(req) == WRITE) cmd->sc_data_direction = DMA_TO_DEVICE; else diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h index a2e0c10..1223412 100644 --- a/include/scsi/scsi_cmnd.h +++ b/include/scsi/scsi_cmnd.h @@ -2,11 +2,11 @@ #define _SCSI_SCSI_CMND_H #include <linux/dma-mapping.h> +#include <linux/blkdev.h> #include <linux/list.h> #include <linux/types.h> #include <linux/timer.h> -struct request; struct scatterlist; struct Scsi_Host; struct scsi_device; @@ -135,4 +135,21 @@ extern void scsi_kunmap_atomic_sg(void *virt); extern struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *, gfp_t); extern void scsi_free_sgtable(struct scatterlist *, int); +static inline int scsi_bidi_cmnd(struct scsi_cmnd *cmd) +{ + return blk_bidi_rq(cmd->request) && + (cmd->request->next_rq->special != NULL); +} + +static inline struct scsi_cmnd *scsi_in(struct scsi_cmnd *cmd) +{ + return scsi_bidi_cmnd(cmd) ? + cmd->request->next_rq->special : cmd; +} + +static inline struct scsi_cmnd *scsi_out(struct scsi_cmnd *cmd) +{ + return cmd; +} + #endif /* _SCSI_SCSI_CMND_H */ -- 1.5.0.4.402.g8035