Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At this point, I've turned this work over to Don; I presume he'll want to address the bitmask search. I'll leave it up to him to decide what to do about the refcount.


                Webb


On 3/26/15 11:10 AM, Tomas Henzl wrote:
On 03/26/2015 03:38 PM, Webb Scales wrote:
Ah!  Tomas, you are absolutely correct!  The loop should not be
restarting the search from the beginning of the bitfield, but rather
should be proceeding to check the next bit.  (Otherwise, there's no
point in having more than one bit!!)
Most of the time it will work as expected, my comment was about a corner case.
Are you going to repost this patch? Btw. I think that a local variable
to restart the loop with an incremented value is enough.

(This code has been tweaked so many times that when I read it now I no
longer see what it actually does....)

And, Tomas, you have a good point regarding the difference between
atomic_inc() and atomic_inc_return(), but, again, the difference is only
a couple of instructions in the context of a long code path, so I think
it's a difference without a distinction.  (And, I'm looking forward to
the day when all of the reference counting stuff can be removed...I
think it's not far off.)
If i got it right the refcount exist only because of the error handler,
i also think it might be rewritten so, that it could be removed.
This the atomic_inc/and atomic_inc_return is a small detail, but if you repost
the patch why not fix it too?

tomash


                  Webb



On 3/26/15 8:47 AM, Tomas Henzl wrote:
On 03/25/2015 07:33 PM, Webb Scales wrote:
Tomas,

You are correct that the previous approach of using find_next_zero_bit()
with a persistent offset is more run-time efficient in the worst case;
however, given that "normal" operation doesn't call this allocation
routine, and given that, when this routine is called, the bit mask being
searched only has about 16 bits in it, I opted for simplicity over
efficiency -- that is, I doubt that the difference in efficiency is
discernible, while getting rid of the last_allocation field is a
worthwhile savings in both memory use and code.
My comment is not about efficiency, I believe that when you measure it you wont be
able to measure any significant difference.
But if this '(unlikely(refcount > 1))' is true for let's say the first entry in the bitfield
this and any other command submitted later will not pass the cmd-alloc until the command that
caused '(unlikely(refcount > 1))' to be true is resolved. That might cause unexpected
behavior.

Regarding your earlier comment on the refcount variable, I believe that
it was handy for debugging purposes.  The code has undergone numerous
revisions, and the variable certainly could now be removed from the
source per your suggestion.  (Of course, the compiler is already
removing it, I'm sure. ;-) )
Not sure if the compiler is able to switch from 'atomic_inc_return' to 'atomic_inc' :) though.
It is not important. (I'd not comment on this without the other cmd_alloc inaccuracy)

tomash

                   Webb



-----Original Message-----
From: Tomas Henzl [mailto:thenzl@xxxxxxxxxx]
Sent: Monday, March 23, 2015 11:58 AM
To: Don Brace; Scott Teel; Kevin Barnett; james.bottomley@xxxxxxxxxxxxx; hch@xxxxxxxxxxxxx; Justin Lindley; brace
Cc: linux-scsi@xxxxxxxxxxxxxxx
Subject: Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation

On 03/17/2015 09:06 PM, Don Brace wrote:

From: Webb Scales <webbnh@xxxxxx>

Rework slave allocation:
     - separate the tagging support setup from the hostdata setup
     - make the hostdata setup act consistently when the lookup fails
     - make the hostdata setup act consistently when the device is not added
     - set up the queue depth consistently across these scenarios
     - if the block layer mq support is not available, explicitly enable and
       activate the SCSI layer tcq support (and do this at allocation-time so
       that the tags will be available for INQUIRY commands)

Tweak slave configuration so that devices which are masked are also
not attached.

Reviewed-by: Scott Teel <scott.teel@xxxxxxxx>
Reviewed-by: Kevin Barnett <kevin.barnett@xxxxxxxx>
Signed-off-by: Webb Scales <webbnh@xxxxxx>
Signed-off-by: Don Brace <don.brace@xxxxxxxx>
---
    drivers/scsi/hpsa.c |  153 +++++++++++++++++++++++++++++++++++++++++----------
    drivers/scsi/hpsa.h |    1
    2 files changed, 123 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 34c178c..4e34a62 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -44,6 +44,7 @@
    #include <scsi/scsi_host.h>
    #include <scsi/scsi_tcq.h>
    #include <scsi/scsi_eh.h>
+#include <scsi/scsi_dbg.h>
    #include <linux/cciss_ioctl.h>
    #include <linux/string.h>
    #include <linux/bitmap.h>
@@ -212,6 +213,9 @@ static int hpsa_compat_ioctl(struct scsi_device *dev, int cmd,
static void cmd_free(struct ctlr_info *h, struct CommandList *c);
    static struct CommandList *cmd_alloc(struct ctlr_info *h);
+static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c);
+static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
+					    struct scsi_cmnd *scmd);
    static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
    	void *buff, size_t size, u16 page_code, unsigned char *scsi3addr,
    	int cmd_type);
@@ -2047,11 +2051,17 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
    	}
    }
+static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
+				      struct CommandList *c)
+{
+	hpsa_cmd_resolve_events(h, c);
+	cmd_tagged_free(h, c);
+}
+
    static void hpsa_cmd_free_and_done(struct ctlr_info *h,
    		struct CommandList *c, struct scsi_cmnd *cmd)
    {
-	hpsa_cmd_resolve_events(h, c);
-	cmd_free(h, c);
+	hpsa_cmd_resolve_and_free(h, c);
    	cmd->scsi_done(cmd);
    }
@@ -2072,8 +2082,7 @@ static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
    	hpsa_set_scsi_cmd_aborted(cmd);
    	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
    			 c->Request.CDB, c->err_info->ScsiStatus);
-	hpsa_cmd_resolve_events(h, c);
-	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
+	hpsa_cmd_resolve_and_free(h, c);
    }
static void process_ioaccel2_completion(struct ctlr_info *h,
@@ -4535,7 +4544,7 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
    	}
if (hpsa_scatter_gather(h, c, cmd) < 0) { /* Fill SG list */
-		cmd_free(h, c);
+		hpsa_cmd_resolve_and_free(h, c);
    		return SCSI_MLQUEUE_HOST_BUSY;
    	}
    	enqueue_cmd_and_start_io(h, c);
@@ -4581,6 +4590,8 @@ static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
    {
    	dma_addr_t cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
+ BUG_ON(c->cmdindex != index);
+
    	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
    	memset(c->err_info, 0, sizeof(*c->err_info));
    	c->busaddr = (u32) cmd_dma_handle;
@@ -4675,27 +4686,24 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
/* Get the ptr to our adapter structure out of cmd->host. */
    	h = sdev_to_hba(cmd->device);
+
+	BUG_ON(cmd->request->tag < 0);
+
    	dev = cmd->device->hostdata;
    	if (!dev) {
    		cmd->result = DID_NO_CONNECT << 16;
    		cmd->scsi_done(cmd);
    		return 0;
    	}
-	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
- if (unlikely(lockup_detected(h))) {
-		cmd->result = DID_NO_CONNECT << 16;
-		cmd->scsi_done(cmd);
-		return 0;
-	}
-	c = cmd_alloc(h);
+	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
if (unlikely(lockup_detected(h))) {
    		cmd->result = DID_NO_CONNECT << 16;
-		cmd_free(h, c);
    		cmd->scsi_done(cmd);
    		return 0;
    	}
+	c = cmd_tagged_alloc(h, cmd);
/*
    	 * Call alternate submit routine for I/O accelerated commands.
@@ -4708,7 +4716,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
    		if (rc == 0)
    			return 0;
    		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
-			cmd_free(h, c);
+			hpsa_cmd_resolve_and_free(h, c);
    			return SCSI_MLQUEUE_HOST_BUSY;
    		}
    	}
@@ -4822,15 +4830,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
    	sh->hostdata[0] = (unsigned long) h;
    	sh->irq = h->intr[h->intr_mode];
    	sh->unique_id = sh->irq;
+	error = scsi_init_shared_tag_map(sh, sh->can_queue);
+	if (error) {
+		dev_err(&h->pdev->dev,
+			"%s: scsi_init_shared_tag_map failed for controller %d\n",
+			__func__, h->ctlr);
+		goto fail_host_put;
+	}
    	error = scsi_add_host(sh, &h->pdev->dev);
-	if (error)
+	if (error) {
+		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
+			__func__, h->ctlr);
    		goto fail_host_put;
+	}
    	scsi_scan_host(sh);
    	return 0;
fail_host_put:
-	dev_err(&h->pdev->dev, "%s: scsi_add_host"
-		" failed for controller %d\n", __func__, h->ctlr);
    	scsi_host_put(sh);
    	return error;
     fail:
@@ -4840,6 +4856,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
    }
/*
+ * The block layer has already gone to the trouble of picking out a unique,
+ * small-integer tag for this request.  We use an offset from that value as
+ * an index to select our command block.  (The offset allows us to reserve the
+ * low-numbered entries for our own uses.)
+ */
+static int hpsa_get_cmd_index(struct scsi_cmnd *scmd)
+{
+	int idx = scmd->request->tag;
+
+	if (idx < 0)
+		return idx;
+
+	/* Offset to leave space for internal cmds. */
+	return idx += HPSA_NRESERVED_CMDS;
+}
+
+/*
     * Send a TEST_UNIT_READY command to the specified LUN using the specified
     * reply queue; returns zero if the unit is ready, and non-zero otherwise.
     */
@@ -4979,18 +5012,18 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
    	/* if controller locked up, we can guarantee command won't complete */
    	if (lockup_detected(h)) {
    		dev_warn(&h->pdev->dev,
-			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
-			h->scsi_host->host_no, dev->bus, dev->target,
-			dev->lun);
+			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, lockup detected\n",
+			 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+			 hpsa_get_cmd_index(scsicmd));
    		return FAILED;
    	}
/* this reset request might be the result of a lockup; check */
    	if (detect_controller_lockup(h)) {
    		dev_warn(&h->pdev->dev,
-			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
+			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, new lockup detected\n",
    			 h->scsi_host->host_no, dev->bus, dev->target,
-			 dev->lun);
+			 dev->lun, hpsa_get_cmd_index(scsicmd));
    		return FAILED;
    	}
@@ -5442,6 +5475,59 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
    }
/*
+ * For operations with an associated SCSI command, a command block is allocated
+ * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the
+ * block request tag as an index into a table of entries.  cmd_tagged_free() is
+ * the complement, although cmd_free() may be called instead.
+ */
+static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
+					    struct scsi_cmnd *scmd)
+{
+	int idx = hpsa_get_cmd_index(scmd);
+	struct CommandList *c = h->cmd_pool + idx;
+	int refcount = 0;
+
+	if (idx < HPSA_NRESERVED_CMDS || idx >= h->nr_cmds) {
+		dev_err(&h->pdev->dev, "Bad block tag: %d not in [%d..%d]\n",
+			idx, HPSA_NRESERVED_CMDS, h->nr_cmds - 1);
+		/* The index value comes from the block layer, so if it's out of
+		 * bounds, it's probably not our bug.
+		 */
+		BUG();
+	}
+
+	refcount = atomic_inc_return(&c->refcount);
refcount is never used, use atomic_inc(&c->refcount); instead?

+	if (unlikely(!hpsa_is_cmd_idle(c))) {
+		/*
+		 * We expect that the SCSI layer will hand us a unique tag
+		 * value.  Thus, there should never be a collision here between
+		 * two requests...because if the selected command isn't idle
+		 * then someone is going to be very disappointed.
+		 */
+		dev_err(&h->pdev->dev,
+			"tag collision (tag=%d) in cmd_tagged_alloc().\n",
+			idx);
+		if (c->scsi_cmd != NULL)
+			scsi_print_command(c->scsi_cmd);
+		scsi_print_command(scmd);
+	}
+
+	hpsa_cmd_partial_init(h, idx, c);
+	return c;
+}
+
+static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c)
+{
+	/*
+	 * Release our reference to the block.  We don't need to do anything
+	 * else to free it, because it is accessed by index.  (There's no point
+	 * in checking the result of the decrement, since we cannot guarantee
+	 * that there isn't a concurrent abort which is also accessing it.)
+	 */
+	(void)atomic_dec(&c->refcount);
+}
+
+/*
     * For operations that cannot sleep, a command block is allocated at init,
     * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
     * which ones are free or in use.  Lock must be held when calling this.
@@ -5454,7 +5540,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
    {
    	struct CommandList *c;
    	int refcount, i;
-	unsigned long offset;
/*
    	 * There is some *extremely* small but non-zero chance that that
@@ -5466,31 +5551,39 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
    	 * very unlucky thread might be starved anyway, never able to
    	 * beat the other threads.  In reality, this happens so
    	 * infrequently as to be indistinguishable from never.
+	 *
+	 * Note that we start allocating commands before the SCSI host structure
+	 * is initialized.  Since the search starts at bit zero, this
+	 * all works, since we have at least one command structure available;
+	 * however, it means that the structures with the low indexes have to be
+	 * reserved for driver-initiated requests, while requests from the block
+	 * layer will use the higher indexes.
    	 */
- offset = h->last_allocation; /* benignly racy */
    	for (;;) {
-		i = find_next_zero_bit(h->cmd_pool_bits, h->nr_cmds, offset);
-		if (unlikely(i == h->nr_cmds)) {
-			offset = 0;
+		i = find_first_zero_bit(h->cmd_pool_bits, HPSA_NRESERVED_CMDS);
+		if (unlikely(i >= HPSA_NRESERVED_CMDS))
    			continue;
-		}
    		c = h->cmd_pool + i;
    		refcount = atomic_inc_return(&c->refcount);
    		if (unlikely(refcount > 1)) {
    			cmd_free(h, c); /* already in use */
-			offset = (i + 1) % h->nr_cmds;
Hi Don,
when this happens - a command has its bitfield flag cleared and, but it's taken - refcount is > 1
it will be so likely for next several thousands of tests in this function until the it is freed.
When it is the first bit in the bitfield it will block all following commands sent to the card for that time.
The previous variant  'find_next_zero_bit + offset = (i + 1) % h->nr_cmds' seems to handle this better.
Cheers, Tomas

    			continue;
    		}
    		set_bit(i & (BITS_PER_LONG - 1),
    			h->cmd_pool_bits + (i / BITS_PER_LONG));
    		break; /* it's ours now. */
    	}
-	h->last_allocation = i; /* benignly racy */
    	hpsa_cmd_partial_init(h, i, c);
    	return c;
    }
+/*
+ * This is the complementary operation to cmd_alloc().  Note, however, in some
+ * corner cases it may also be used to free blocks allocated by
+ * cmd_tagged_alloc() in which case the ref-count decrement does the trick and
+ * the clear-bit is harmless.
+ */
    static void cmd_free(struct ctlr_info *h, struct CommandList *c)
    {
    	if (atomic_dec_and_test(&c->refcount)) {
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 3ec8934..2536b67 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -141,7 +141,6 @@ struct ctlr_info {
    	struct CfgTable __iomem *cfgtable;
    	int	interrupts_enabled;
    	int 	max_commands;
-	int last_allocation;
    	atomic_t commands_outstanding;
    #	define PERF_MODE_INT	0
    #	define DOORBELL_INT	1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux