Re: FC target Errors


It is apparent that we need a check on the transfer size; crashing the kernel is not a good thing. It is also apparent that the expected behaviour of the back end isn't going to cut it, due to problems on certain front-end fabrics (Fibre Channel, for instance). What Nicholas is proposing here addresses the immediate issue of overrunning things, and it looks like an improvement over the older fabric_max_sectors approach (applied across all LUNs) that was removed. Basing the limit on the maximum number of SG entries seems like a good enhancement too. Unfortunately, the reasons for the removal of fabric_max_sectors remain. As Nicholas noted in commit 046ba64, most initiators ignore the published MAXIMUM TRANSFER LENGTH limit. Also, many (most?) legacy devices expect a much larger transfer size to be handled. They don't check for a limit either; they just blast away and everything falls apart.

It is not just legacy stuff though. The one thing that seems constant in the data storage business is that data gets bigger. That brings us to what Roland and Arun have brought up. We are going to need a better solution in the long run. Expecting the hardware to handle this when other platforms deal with it in their driver stack isn't going to happen. That's a matter of economics I've never seen broken in this business. When performance drives larger cache sizes, we might see a change.

I have a current need for transfer sizes in the 30MB range. Fibre Channel systems (with QLogic HBAs) were doing this 10-15 years ago. Imposing a limit of 4MB now is closing the barn door long after the horses are out. Incidentally, on legacy systems, large transfer sizes can make a huge difference in system throughput. I have no idea why, since at the sizes involved (4MB vs. 30MB) the small amount of protocol overhead added by breaking a transaction into 7 or 10 pieces seems trivial. My customers have demonstrated this, though: their systems fail to operate within requirements when they reduce the transfer size. Fibre Channel has been the high-end performance leader for a long time, and the legacy systems that were pushed onto Fibre Channel by performance requirements were often tweaked and optimized even then.

So, how do I make this happen? Can someone (Nicholas? Roland? Arun?) point me to the areas of code where this should be worked? Arun, is there a way to get hardware documentation?

Craig Watson


On 07/08/2015 04:25 PM, Arun Easi wrote:
Hi Nick,

On Mon, 6 Jul 2015, 9:08pm -0700, Nicholas A. Bellinger wrote:

(Adding Giri + Andrew CC')

On Wed, 2015-07-01 at 15:50 -0700, Roland Dreier wrote:
On Wed, Jul 1, 2015 at 3:41 PM, Craig Watson
<craig.watson@xxxxxxxxxxxxxxxxxxx> wrote:
So, is it really a limitation in the hardware? .. or is it in the qla2xxx driver? Does the qla2xxx driver not support chained scatter/gather lists?

Since the removal of fabric_max_sectors et al., how is this size limit
communicated to an initiator? From what I understood, the target front end is supposed to reflect the limit of the back-end storage. Right now what I (and our customers) see is just a refusal to operate at large (> 4.9MB) transfer sizes. At some write sizes there aren't even any error indications on the target; the target just stops responding. That isn't a good thing. It's better than crashing, but not by much. Other sizes do seem to crash the
kernel, which really isn't a good thing.

The hardware really does have the limitation that a single CTIO ("continue
target IO") control block can only hold 1200-something scatter-gather
entries (which matches well with your 4.9 MB limit, given that each
entry will typically be a 4KB page).  However, the fact that this
limits the size of a SCSI command is of course in driver code.  LIO
can use multiple CTIOs to respond to a single SCSI command; it
doesn't at the moment, though.

There is no way this is communicated to the initiator that I know of
at the moment.  I did not understand the "remove IO size limit"
commits when they went in, but they definitely leave the target stack
in a broken state WRT this type of issue.


Ok, so qla2xxx really does need some manner of max_sectors enforced by
target-core due to its HW SGL limits, but I think the original way of
using a backend attribute hard-coded across all LUNs was wrong.

I agree with the general idea here (at least short term).


How about a fabric driver provided field like the following instead..?

diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 33364b6..ce4e84f 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -1814,6 +1814,7 @@ static const struct target_core_fabric_ops tcm_qla2xxx_ops = {
    .module                = THIS_MODULE,
    .name                = "qla2xxx",
    .node_acl_size            = sizeof(struct tcm_qla2xxx_nacl),
+    .fabric_hw_max_sectors        = 8192,

As the limit is in number of SG entries, my preference would be to state it that way, instead of in terms of sectors, something like:
    .max_sg_entries            = XX,

The stack could internally convert that to max-sectors using a page-size and sector-size calculation, report it in EVPD 0xb0, and use it for enforcement.

Longer term (Roland alluded to this, I think), it would be best to have the stack call .queue_data_in/.write_pending with at most "max_sg_entries" SG entries at a time and repeat until the whole data is transferred, thus achieving a potentially unlimited data transfer length.

Regards,
-Arun

    .get_fabric_name        = tcm_qla2xxx_get_fabric_name,
    .tpg_get_wwn            = tcm_qla2xxx_get_fabric_wwn,
    .tpg_get_tag            = tcm_qla2xxx_get_tag,
diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index e318ddb..e730439 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -1081,6 +1081,15 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)

    if (cmd->se_cmd_flags & SCF_SCSI_DATA_CDB) {
        unsigned long long end_lba;
+
+        if (cmd->se_tfo->fabric_hw_max_sectors &&
+            sectors > cmd->se_tfo->fabric_hw_max_sectors) {
+            printk_ratelimited(KERN_ERR "SCSI OP %02xh with too"
+                " big sectors %u exceeds fabric_hw_max_sectors:"
+                " %d\n", cdb[0], sectors,
+                cmd->se_tfo->fabric_hw_max_sectors);
+            return TCM_INVALID_CDB_FIELD;
+        }
check_lba:
        end_lba = dev->transport->get_blocks(dev) + 1;
        if (((cmd->t_task_lba + sectors) < cmd->t_task_lba) ||
diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index b5ba1ec..5675dc6 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -517,7 +517,8 @@ spc_emulate_evpd_b0(struct se_cmd *cmd, unsigned char *buf)
    /*
     * Set MAXIMUM TRANSFER LENGTH
     */
-    put_unaligned_be32(dev->dev_attrib.hw_max_sectors, &buf[8]);
+    put_unaligned_be32(min_not_zero(cmd->se_tfo->fabric_hw_max_sectors,
+                       dev->dev_attrib.hw_max_sectors), &buf[8]);

    /*
     * Set OPTIMAL TRANSFER LENGTH
diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h
index 18afef9..e47ef8b 100644
--- a/include/target/target_core_fabric.h
+++ b/include/target/target_core_fabric.h
@@ -5,6 +5,12 @@ struct target_core_fabric_ops {
    struct module *module;
    const char *name;
    size_t node_acl_size;
+    /*
+     * If the fabric is hardware limited by the number of sectors it can
+     * handle in a single I/O request, notify target-core to enforce this
+     * limit and report during EVPD=0xb0 MAXIMUM TRANSFER LENGTH.
+     */
+    u32 fabric_hw_max_sectors;
    char *(*get_fabric_name)(void);
    char *(*tpg_get_wwn)(struct se_portal_group *);
    u16 (*tpg_get_tag)(struct se_portal_group *);

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




