Re: FC target Errors


It is apparent that we need a check on the transfer size; crashing the kernel is not a good thing. It is also apparent that the expected behaviour of the back end isn't going to cut it, due to problems on certain front-end fabrics (Fibre Channel, for instance). What Nicholas is proposing here addresses the immediate issue of overrunning things, and it looks like an improvement over the older fabric_max_sectors approach (applied across all LUNs) that was removed. Basing the limit on the maximum number of SG entries seems like a good enhancement too. Unfortunately, the reasons for the removal of fabric_max_sectors remain. As Nicholas noted in commit 046ba64, most initiators ignore the published MAXIMUM TRANSFER LENGTH limit. Also, many (most?) legacy devices expect a much larger transfer size to be handled. They don't check for a limit either; they just blast away and everything falls apart.

It is not just legacy stuff though. The one thing that seems constant in the data storage business is that data gets bigger. That brings us to what Roland and Arun have brought up. We are going to need a better solution in the long run. Expecting the hardware to handle this when other platforms deal with it in their driver stack isn't going to happen. That's a matter of economics I've never seen broken in this business. When performance drives larger cache sizes, we might see a change.

I have a current need for transfer sizes in the 30MB range. Fibre Channel systems (with QLogic HBAs) were doing this 10-15 years ago. Imposing a limit of 4MB now is closing the barn door long after the horses are out. Incidentally, on legacy systems, large transfer sizes can make a huge difference in system throughput. I have no idea why, since at the sizes involved (4MB vs. 30MB) the small amount of protocol overhead added by breaking a transaction into 7 or 10 pieces seems trivial. My customers have demonstrated this, though: their systems fail to operate within requirements when they reduce the transfer size. Fibre Channel has been the high-end performance leader for a long time, and the legacy systems that were pushed onto Fibre Channel by performance requirements were often tweaked and optimized even then.

So, how do I make this happen? Can someone (Nicholas? Roland? Arun?) point me to the areas of code where this should be worked? Arun, is there a way to get hardware documentation?

Craig Watson


On 07/08/2015 04:25 PM, Arun Easi wrote:
Hi Nick,

On Mon, 6 Jul 2015, 9:08pm -0700, Nicholas A. Bellinger wrote:

(Adding Giri + Andrew CC')

On Wed, 2015-07-01 at 15:50 -0700, Roland Dreier wrote:
On Wed, Jul 1, 2015 at 3:41 PM, Craig Watson
<craig.watson@xxxxxxxxxxxxxxxxxxx> wrote:
So, is it really a limitation in the hardware? .. or is it in the qla2xxx driver? Does the qla2xxx driver not support chained scatter/gather lists?

Since the removal of fabric_max_sectors et al., how is this size limit
communicated to an initiator? From what I understood, the target front end is supposed to reflect the limit of the back-end storage. Right now what I (and our customers) see is just a refusal to operate at large (> 4.9MB) transfer sizes. At some write sizes there aren't even any error indications on the target; the target just stops responding. That isn't a good thing. It's better than crashing, but not by much. Other sizes do seem to crash the
kernel, which really isn't a good thing.

The hardware really does have the limitation that a single CTIO ("continue
target IO") control block can only hold 1200-something scatter-gather
entries (which matches well with your 4.9 MB limit, given that each
entry will typically be a 4KB page).  However, the fact that this
limits the size of a SCSI command is of course in driver code.  LIO
can use multiple CTIOs to respond to a single SCSI command; it
doesn't at the moment, though.

There is no way this is communicated to the initiator that I know of
at the moment.  I did not understand the "remove IO size limit"
commits when they went in, but they definitely leave the target stack
in a broken state WRT this type of issue.


Ok, so qla2xxx really does need some manner of max_sectors enforced by
target-core due to its HW SGL limits, but I think the original way of
using a backend attribute hard-coded across all LUNs was wrong.

I agree with the general idea here (at least short term).


How about a fabric driver provided field like the following instead..?

diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 33364b6..ce4e84f 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -1814,6 +1814,7 @@ static const struct target_core_fabric_ops tcm_qla2xxx_ops = {
    .module                = THIS_MODULE,
    .name                = "qla2xxx",
    .node_acl_size            = sizeof(struct tcm_qla2xxx_nacl),
+    .fabric_hw_max_sectors        = 8192,

As the limit is in number of SG entries, my preference would be to state it that way, instead of in terms of sectors, something like:
    .max_sg_entries            = XX,

The stack could internally convert that to max-sectors using a page-size and sector-size calculation, report it in EVPD 0xb0, and use it for enforcement.

Longer term (Roland alluded to this, I think), it would be best to have the stack call .queue_data_in/.write_pending with at most "max_sg_entries" SG entries at a time and repeat until the whole data is transferred, thus achieving a potentially unlimited data transfer length.

Regards,
-Arun

    .get_fabric_name        = tcm_qla2xxx_get_fabric_name,
    .tpg_get_wwn            = tcm_qla2xxx_get_fabric_wwn,
    .tpg_get_tag            = tcm_qla2xxx_get_tag,
diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index e318ddb..e730439 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -1081,6 +1081,15 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)

    if (cmd->se_cmd_flags & SCF_SCSI_DATA_CDB) {
        unsigned long long end_lba;
+
+        if (cmd->se_tfo->fabric_hw_max_sectors &&
+            sectors > cmd->se_tfo->fabric_hw_max_sectors) {
+            printk_ratelimited(KERN_ERR "SCSI OP %02xh with too"
+                " big sectors %u exceeds fabric_hw_max_sectors:"
+                " %d\n", cdb[0], sectors,
+                cmd->se_tfo->fabric_hw_max_sectors);
+            return TCM_INVALID_CDB_FIELD;
+        }
check_lba:
        end_lba = dev->transport->get_blocks(dev) + 1;
        if (((cmd->t_task_lba + sectors) < cmd->t_task_lba) ||
diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index b5ba1ec..5675dc6 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -517,7 +517,8 @@ spc_emulate_evpd_b0(struct se_cmd *cmd, unsigned char *buf)
    /*
     * Set MAXIMUM TRANSFER LENGTH
     */
-    put_unaligned_be32(dev->dev_attrib.hw_max_sectors, &buf[8]);
+    put_unaligned_be32(min_not_zero(cmd->se_tfo->fabric_hw_max_sectors,
+                       dev->dev_attrib.hw_max_sectors), &buf[8]);

    /*
     * Set OPTIMAL TRANSFER LENGTH
diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h
index 18afef9..e47ef8b 100644
--- a/include/target/target_core_fabric.h
+++ b/include/target/target_core_fabric.h
@@ -5,6 +5,12 @@ struct target_core_fabric_ops {
    struct module *module;
    const char *name;
    size_t node_acl_size;
+    /*
+     * If the fabric is hardware limited by the number of sectors it can
+     * handle in a single I/O request, notify target-core to enforce this
+     * limit and report during EVPD=0xb0 MAXIMUM TRANSFER LENGTH.
+     */
+    u32 fabric_hw_max_sectors;
    char *(*get_fabric_name)(void);
    char *(*tpg_get_wwn)(struct se_portal_group *);
    u16 (*tpg_get_tag)(struct se_portal_group *);

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




