Re: [PATCH v2] add bidi support for block pc requests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



FUJITA Tomonori wrote:
> Here is an updated version of the patch to add bidi support to block
> pc requests. Bugs spotted by Benny were fixed.
> 
> This patch can be applied cleanly to the scsi-misc git tree and is on
> the top of the following patch to add linked request support:
> 
> http://marc.info/?l=linux-scsi&m=117835587615642&w=2
> 
> ---
> From: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx>
> Date: Mon, 7 May 2007 16:42:24 +0900
> Subject: [PATCH] add bidi support to scsi pc commands
> 
> This adds bidi support to block pc requests.
> 
> A bidi request uses req->next_rq pointer for an in request.
> 
> This patch introduces a new structure, scsi_data_buffer to hold the
> data buffer information. To avoid make scsi_cmnd structure fatter, the
> scsi mid-layer uses cmnd->request->next_rq->special pointer for
> a scsi_data_buffer structure. LLDs don't touch the second request
> (req->next_rq) so it's safe to use req->special.
> 
> scsi_blk_pc_done() always completes the whole command so
> scsi_end_request() simply completes the bidi chunk too.
> 
> A helper function, scsi_bidi_data_buffer() is for LLDs to access to
> the scsi_data_buffer structure easily.
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx>
> ---
> @@ -685,6 +696,14 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate,
>  		}
>  	}
>  
> +	/*
> +	 * a REQ_BLOCK_PC command is always completed fully so just do
> +	 * end the bidi chunk.
> +	 */
> +	if (sdb)
> +		end_that_request_chunk(req->next_rq, uptodate,
> +				       sdb->request_bufflen);
> +

sdb->request_bufflen was zeroed in scsi_release_buffers which is called before
scsi_end_request.

>  static void scsi_blk_pc_done(struct scsi_cmnd *cmd)
>  {
>  	BUG_ON(!blk_pc_request(cmd->request));
> @@ -1090,10 +1159,22 @@ static void scsi_blk_pc_done(struct scsi_cmnd *cmd)
>  static int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
>  {
>  	struct scsi_cmnd *cmd;
> +	struct scsi_data_buffer *sdb = NULL;
> +
> +	if (blk_bidi_rq(req)) {
> +		sdb = kzalloc(sizeof(*sdb), GFP_ATOMIC);
> +		if (unlikely(!sdb))
> +			return BLKPREP_DEFER;
> +	}
>  
>  	cmd = scsi_get_cmd_from_req(sdev, req);
> -	if (unlikely(!cmd))
> +	if (unlikely(!cmd)) {
> +		kfree(sdb);
>  		return BLKPREP_DEFER;
> +	}
> +
> +	if (sdb)
> +		req->next_rq->special = sdb;
>  
>  	/*
>  	 * BLOCK_PC requests may transfer data, in which case they must

Before I get to my main concern here I have one comment. in scsi_get_cmd_from_req()
there is a code path in which a scsi_cmnd is taken from special and is not newly
allocated. It is best to move bidi allocation to scsi_get_cmd_from_req() and allocate
new sdb only if we get a new Command. (See my attached patch for an example)

OK! My main concern is with kzalloc(sizeof(*sdb), GFP_ATOMIC);
All IO allocations should come from slabs in case of a memory starved system.
There are 3 possible solutions I see:
1. Use struct scsi_cmnd for bidi_read and not a special scsi_data_buffer.
   Attached is above solution. It is by far the simplest of the three.
   So simple that even Jens's BigIO patch can almost apply at scsi_lib.c. (and vise versa)
   What's hanged on the request->next_rq->special is a second scsi_cmnd.
   The command is taken from regular command pool. This solution, though
   wasteful of some memory is very stable.

2. Force the users of BIDI to allocate scsi_data_buffer(s)
   Users will allocate a slab for scsi_data_buffers and hang them on nex_rq->special before
   hand. Than free them together with second request. This approach can be very stable, But
   it is bad because it is a layering violation. When block and SCSI layers decide to change
   the wiring of bidi. Users will have to change with them.

3. Let SCSI-ml manage a slab for scsi_data_buff's
   Have a flag to  scsi_setup_command_freelist() or a new API. When a device is flagged
   bidi-able we could 1-Allocate bigger slots to have extra memory for SDBs together
   with the command itself. or 2-allocate yet another slab for SDBs.

The 3rd approach is clean, and I can submit a patch for it. But I think it is not currently
necessary. The first solution (See attached patch) we can all live with, I hope. Since for
me it gives me the stability I need. And for the rest of the kernel it is the minimum
change, layered on existing building blocks.

If any one wants to try it out I put up the usual patchset that exercise this approach here.
http://www.bhalevy.com/open-osd/download/double_rq_bidi

Thanks
Boaz

>From ac8f0d3df711c5d01a269fde6a4ecce3000c3f68 Mon Sep 17 00:00:00 2001
From: Boaz Harrosh <bharrosh@xxxxxxxxxxx>
Date: Tue, 8 May 2007 14:04:46 +0300
Subject: [PATCH] scsi-ml bidi support

    At the block level bidi request uses req->next_rq pointer for a second
    bidi_read request.
    At Scsi-midlayer a second scsi_cmnd structure is used for the sglists
    of the bidi_read part. This command is put on request->next_rq->special.
    So struct scsi_cmnd is not changed. And scsi-ml minimally changes.

    - Define scsi_end_bidi_request() to do what scsi_end_request() does but for a
      bidi request. This is necessary because bidi commands are a bit tricky here.
      (See comments in body)

    - scsi_release_buffers() will also release the bidi_read buffers.
      I have changed the check from "if (cmd->use_sg)" to
      "if(cmd->request_buffer)" because it can now be called on half
      prepared commands. (and use_sg is no longer relevant here)

    - scsi_io_completion() will now call scsi_end_bidi_request on bidi commands
      and return.

    - The previous work done in scsi_init_io() is now done in a new
      scsi_init_data_buf() (which is 95% identical to old scsi_init_io())
      The new scsi_init_io() will call the above twice if needed also for
      the bidi_read command.

    - scsi_get_cmd_from_req() (called from scsi_setup_blk_pc_cmnd()) in the
      case that a new command is allocated, will also allocate a bidi_read
      command and hang it on cmd->request->next_rq->special. Only at this
      point is a command bidi.

    - scsi_setup_blk_pc_cmnd() will update sc_dma_direction to DMA_BIDIRECTIONAL
      which is not correct. It is only here for compatibility with iSCSI bidi
      driver. At final patch this will be a DMA_TO_DEVICE for this command and
      DMA_FROM_DEVICE for the bidi_read command. A driver must call scsi_bidi_cmnd()
      to work on bidi commands.

    - Define scsi_bidi_cmnd() to return true if it is a bidi command and a
      second command was allocated.

    - Define scsi_in()/scsi_out() to return the in or out commands from this command
      This API is to isolate users from the mechanics of bidi, and is also a short
      hand to common used code. (Even in scsi_lib.c)

    - In scsi_error.c at scsi_send_eh_cmnd() make sure bidi-lld is not confused by
      a get-sense command that looks like bidi. This is done by puting NULL at
      request->next_rq, and restoring before exit.
---
 drivers/scsi/scsi_error.c |    5 ++
 drivers/scsi/scsi_lib.c   |  133 ++++++++++++++++++++++++++++++++++++++++++---
 include/scsi/scsi_cmnd.h  |   19 ++++++-
 3 files changed, 148 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index e8350c5..169f4b4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -620,6 +620,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	unsigned char old_cmd_len;
 	unsigned old_bufflen;
 	void *old_buffer;
+	struct request* old_next_rq;
 	int rtn;
 
 	/*
@@ -636,6 +637,9 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	old_cmd_len = scmd->cmd_len;
 	old_use_sg = scmd->use_sg;
 
+	old_next_rq = scmd->request->next_rq;
+	scmd->request->next_rq = NULL;
+
 	memset(scmd->cmnd, 0, sizeof(scmd->cmnd));
 	memcpy(scmd->cmnd, cmnd, cmnd_size);
 
@@ -734,6 +738,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	/*
 	 * Restore original data
 	 */
+	scmd->request->next_rq = old_next_rq;
 	scmd->request_buffer = old_buffer;
 	scmd->request_bufflen = old_bufflen;
 	memcpy(scmd->cmnd, old_cmnd, sizeof(scmd->cmnd));
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 61fbcdc..0f49195 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -85,6 +85,10 @@ static void scsi_unprep_request(struct request *req)
 	req->cmd_flags &= ~REQ_DONTPREP;
 	req->special = NULL;
 
+	if (scsi_bidi_cmnd(cmd)) {
+		scsi_put_command(scsi_in(cmd));
+		cmd->request->next_rq->special = NULL;
+	}
 	scsi_put_command(cmd);
 }
 
@@ -701,6 +705,52 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate,
 	return NULL;
 }
 
+/*
+ * Bidi commands Must be complete as a whole, both sides at once.
+ * If part of the bytes were written and lld returned
+ * scsi_in()->resid or scsi_out()->resid this information will be left
+ * in req->data_len and req->next_rq->data_len. The upper-layer driver can
+ * decide what to do with this information.
+ */
+void scsi_end_bidi_request(struct scsi_cmnd *cmd)
+{
+	struct scsi_cmnd* bidi_in = scsi_in(cmd);
+	request_queue_t *q = cmd->device->request_queue;
+	struct request *req = cmd->request;
+	unsigned long flags;
+
+	end_that_request_chunk(req, 1, req->data_len);
+	req->data_len = cmd->resid;
+	end_that_request_chunk(req->next_rq, 1, req->next_rq->data_len);
+	req->next_rq->data_len = bidi_in->resid;
+
+	req->next_rq->special = NULL;
+	scsi_put_command(bidi_in);
+
+	/*
+	 *FIXME: If ll_rw_blk.c is changed to also put_request(req->next_rq)
+	 *       in end_that_request_last() then this WARN_ON must be removed.
+	 *       Otherwise, upper-driver must have registered an end_io.
+	 */
+	WARN_ON(!req->end_io);
+
+	/* FIXME: the following code can be shared with scsi_end_request */
+	add_disk_randomness(req->rq_disk);
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	if (blk_rq_tagged(req))
+		blk_queue_end_tag(q, req);
+
+	end_that_request_last(req, 1); /*allways good for now*/
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	/*
+	 * This will goose the queue request function at the end, so we don't
+	 * need to worry about launching another command.
+	 */
+	scsi_next_command(cmd);
+}
+
 struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
 {
 	struct scsi_host_sg_pool *sgp;
@@ -775,7 +825,7 @@ EXPORT_SYMBOL(scsi_free_sgtable);
  */
 static void scsi_release_buffers(struct scsi_cmnd *cmd)
 {
-	if (cmd->use_sg)
+	if (cmd->request_buffer)
 		scsi_free_sgtable(cmd->request_buffer, cmd->sglist_len);
 
 	/*
@@ -784,6 +834,15 @@ static void scsi_release_buffers(struct scsi_cmnd *cmd)
 	 */
 	cmd->request_buffer = NULL;
 	cmd->request_bufflen = 0;
+
+	if (scsi_bidi_cmnd(cmd)) {
+		struct scsi_cmnd* bidi_in = scsi_in(cmd);
+		if (bidi_in->request_buffer)
+			scsi_free_sgtable(bidi_in->request_buffer,
+			                           bidi_in->sglist_len);
+		bidi_in->request_buffer = NULL;
+		bidi_in->request_bufflen = 0;
+	}
 }
 
 /*
@@ -849,9 +908,15 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 				req->sense_len = len;
 			}
 		}
+		if (scsi_bidi_cmnd(cmd)) {
+			scsi_end_bidi_request(cmd);
+			return;
+		}
 		req->data_len = cmd->resid;
 	}
 
+	BUG_ON(blk_bidi_rq(req));
+
 	/*
 	 * Next deal with any sectors which we were able to correctly
 	 * handle.
@@ -978,17 +1043,18 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 EXPORT_SYMBOL(scsi_io_completion);
 
 /*
- * Function:    scsi_init_io()
+ * Function:    scsi_init_data_buf()
  *
- * Purpose:     SCSI I/O initialize function.
+ * Purpose:     SCSI I/O initialize helper.
+ *              maps the request buffers for the given cmd.
  *
- * Arguments:   cmd   - Command descriptor we wish to initialize
+ * Arguments:   cmd   - Command whos data we wish to initialize
  *
  * Returns:     0 on success
  *		BLKPREP_DEFER if the failure is retryable
  *		BLKPREP_KILL if the failure is fatal
  */
-static int scsi_init_io(struct scsi_cmnd *cmd)
+static int scsi_init_data_buff(struct scsi_cmnd *cmd)
 {
 	struct request     *req = cmd->request;
 	struct scatterlist *sgpnt;
@@ -1032,12 +1098,45 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
 	printk(KERN_ERR "req nr_sec %lu, cur_nr_sec %u\n", req->nr_sectors,
 			req->current_nr_sectors);
 
-	/* release the command and kill it */
-	scsi_release_buffers(cmd);
-	scsi_put_command(cmd);
 	return BLKPREP_KILL;
 }
 
+/*
+ * Function:    scsi_init_io()
+ *
+ * Purpose:     SCSI I/O initialize function.
+ *
+ * Arguments:   cmd   - Command descriptor we wish to initialize
+ *
+ * Returns:     0 on success
+ *		BLKPREP_DEFER if the failure is retryable
+ *		BLKPREP_KILL if the failure is fatal
+ */
+static int scsi_init_io(struct scsi_cmnd *cmd)
+{
+	struct request *req = cmd->request;
+	int error;
+
+	error = scsi_init_data_buff(cmd);
+	if (error)
+		goto err_exit;
+
+	if (scsi_bidi_cmnd(cmd)) {
+		error = scsi_init_data_buff(scsi_in(cmd));
+		if (error)
+			goto err_exit;
+	}
+
+	req->buffer = NULL;
+	return 0;
+
+err_exit:
+	scsi_release_buffers(cmd);
+	if (error == BLKPREP_KILL)
+		scsi_unprep_request(req);
+	return error;
+}
+
 static int scsi_issue_flush_fn(request_queue_t *q, struct gendisk *disk,
 			       sector_t *error_sector)
 {
@@ -1064,6 +1163,22 @@ static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev,
 		if (unlikely(!cmd))
 			return NULL;
 		req->special = cmd;
+		if (blk_bidi_rq(req)) {
+			struct scsi_cmnd *bidi_in_cmd;
+			/*
+			 * second bidi request must be blk_pc_request for use of
+			 * correct sizes.
+			 */
+			WARN_ON(!blk_pc_request(req->next_rq));
+
+			bidi_in_cmd = scsi_get_command(sdev, GFP_ATOMIC);
+			if (unlikely(!bidi_in_cmd)) {
+				scsi_put_command(cmd);
+				return NULL;
+			}
+			req->next_rq->special = bidi_in_cmd;
+			bidi_in_cmd->request = req->next_rq;
+		}
 	} else {
 		cmd = req->special;
 	}
@@ -1124,6 +1239,8 @@ static int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
 	cmd->cmd_len = req->cmd_len;
 	if (!req->data_len)
 		cmd->sc_data_direction = DMA_NONE;
+	else if (blk_bidi_rq(req))
+		cmd->sc_data_direction = DMA_BIDIRECTIONAL;
 	else if (rq_data_dir(req) == WRITE)
 		cmd->sc_data_direction = DMA_TO_DEVICE;
 	else
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index a2e0c10..1223412 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -2,11 +2,11 @@
 #define _SCSI_SCSI_CMND_H
 
 #include <linux/dma-mapping.h>
+#include <linux/blkdev.h>
 #include <linux/list.h>
 #include <linux/types.h>
 #include <linux/timer.h>
 
-struct request;
 struct scatterlist;
 struct Scsi_Host;
 struct scsi_device;
@@ -135,4 +135,21 @@ extern void scsi_kunmap_atomic_sg(void *virt);
 extern struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *, gfp_t);
 extern void scsi_free_sgtable(struct scatterlist *, int);
 
+static inline int scsi_bidi_cmnd(struct scsi_cmnd *cmd)
+{
+	return blk_bidi_rq(cmd->request) &&
+	       (cmd->request->next_rq->special != NULL);
+}
+
+static inline struct scsi_cmnd *scsi_in(struct scsi_cmnd *cmd)
+{
+	return scsi_bidi_cmnd(cmd) ?
+	       cmd->request->next_rq->special : cmd;
+}
+
+static inline struct scsi_cmnd *scsi_out(struct scsi_cmnd *cmd)
+{
+	return cmd;
+}
+
 #endif /* _SCSI_SCSI_CMND_H */
-- 
1.5.0.4.402.g8035


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux