Re: tcm_fc crash

On Fri, 2014-04-25 at 10:43 -0700, Jun Wu wrote:
> Hi Nicholas,
> 
> Sorry to respond to you late. I have collected the information you want.
> 
> Kernel version:
> root@poc1:~# uname -a
>  Linux poc1 3.11.0-18-generic #32-Ubuntu SMP Tue Feb 18 21:11:14 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> 
> NIC:
> root@poc1:~# lspci | grep 82599
>  08:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
> 

Thanks for the additional info.  Please also provide the specifics of
the FCoE initiator setup.

> Backstores:
> Here is the targetcli output of the target machine. It has 6 hard
> drives exported to 2 initiators.
> /> ls
> o- / ..................................................................... [...]
>   o- backstores .......................................................... [...]
>   | o- fileio ............................................... [0 Storage Object]
>   | o- iblock .............................................. [6 Storage Objects]
>   | | o- diskb ............................................ [/dev/sdb activated]
>   | | o- diskc ............................................ [/dev/sdc activated]
>   | | o- diskd ............................................ [/dev/sdd activated]
>   | | o- diske ............................................ [/dev/sde activated]
>   | | o- diskf ............................................ [/dev/sdf activated]
>   | | o- diskg ............................................ [/dev/sdg activated]
>   | o- pscsi ................................................ [0 Storage Object]
>   | o- rd_dr ................................................ [0 Storage Object]
>   | o- rd_mcp ............................................... [0 Storage Object]
>   o- ib_srpt ........................................................ [0 Target]
>   o- iscsi .......................................................... [0 Target]
>   o- loopback ....................................................... [0 Target]
>   o- qla2xxx ........................................................ [0 Target]
>   o- tcm_fc ......................................................... [1 Target]
>     o- 20:00:00:25:90:ef:03:ec ....................................... [enabled]
>       o- acls ......................................................... [2 ACLs]
>       | o- 20:00:00:25:90:ef:06:1e ............................. [6 Mapped LUNs]
>       | | o- mapped_lun0 ........................................... [lun0 (rw)]
>       | | o- mapped_lun1 ........................................... [lun1 (rw)]
>       | | o- mapped_lun2 ........................................... [lun2 (rw)]
>       | | o- mapped_lun3 ........................................... [lun3 (rw)]
>       | | o- mapped_lun4 ........................................... [lun4 (rw)]
>       | | o- mapped_lun5 ........................................... [lun5 (rw)]
>       | o- 20:00:00:25:90:ef:06:2a ............................. [6 Mapped LUNs]
>       |   o- mapped_lun0 ........................................... [lun0 (rw)]
>       |   o- mapped_lun1 ........................................... [lun1 (rw)]
>       |   o- mapped_lun2 ........................................... [lun2 (rw)]
>       |   o- mapped_lun3 ........................................... [lun3 (rw)]
>       |   o- mapped_lun4 ........................................... [lun4 (rw)]
>       |   o- mapped_lun5 ........................................... [lun5 (rw)]
>       o- luns ......................................................... [6 LUNs]
>         o- lun0 ...................................... [iblock/diskc (/dev/sdc)]
>         o- lun1 ...................................... [iblock/diskd (/dev/sdd)]
>         o- lun2 ...................................... [iblock/diske (/dev/sde)]
>         o- lun3 ...................................... [iblock/diskf (/dev/sdf)]
>         o- lun4 ...................................... [iblock/diskg (/dev/sdg)]
>         o- lun5 ...................................... [iblock/diskb (/dev/sdb)]
> 
> By compiling tcm_fc, we found the RIP (ft_queue_data_in+1386) points
> to tfc_io.c:94.
>  91         /*
>  92          * Setup to use first mem list entry, unless no data.
>  93          */
>  94         BUG_ON(remaining && !se_cmd->t_data_sg);
>  95         if (remaining) {
>  96                 sg = se_cmd->t_data_sg;
>  97                 mem_len = sg->length;
>  98                 mem_off = sg->offset;
>  99                 page = sg_page(sg);
> 100         }
> 
> That is BUG_ON(remaining && !se_cmd->t_data_sg).
> 

So let's find out a little more about the CDB that is triggering the
bug.

Please apply the following patch to your v3.11 tree to dump the se_cmd
in question when the bug is triggered in ft_queue_data_in():

diff --git a/drivers/target/tcm_fc/tfc_io.c b/drivers/target/tcm_fc/tfc_io.c
index e415af3..8009407 100644
--- a/drivers/target/tcm_fc/tfc_io.c
+++ b/drivers/target/tcm_fc/tfc_io.c
@@ -91,7 +91,13 @@ int ft_queue_data_in(struct se_cmd *se_cmd)
        /*
         * Setup to use first mem list entry, unless no data.
         */
-       BUG_ON(remaining && !se_cmd->t_data_sg);
+       if (remaining && !se_cmd->t_data_sg) {
+               printk("CDB: 0x%02x data_length: %u t_data_sg: %p t_data_nents: %u "
+                       "se_cmd_flags: 0x%08x\n", se_cmd->t_task_cdb[0],
+                       se_cmd->data_length, se_cmd->t_data_sg,
+                       se_cmd->t_data_nents, se_cmd->se_cmd_flags);
+               BUG();
+       }
        if (remaining) {
                sg = se_cmd->t_data_sg;
                mem_len = sg->length;
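
To sanity-check the condition and the format string outside the kernel,
here is a minimal user-space sketch.  It is only an illustration: struct
demo_cmd and demo_check_data_in() are hypothetical stand-ins for the few
se_cmd fields dumped above, not target core code, and snprintf() takes
the place of printk().

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the handful of se_cmd fields being dumped. */
struct demo_cmd {
	unsigned char cdb0;          /* first CDB byte, i.e. the SCSI opcode */
	unsigned int data_length;
	void *t_data_sg;             /* NULL models the missing scatterlist */
	unsigned int t_data_nents;
	unsigned int se_cmd_flags;
};

/*
 * Mirrors the condition guarded by the BUG_ON: data remains to be sent
 * but there is no scatterlist.  Returns 1 and fills buf with the dump
 * line when that inconsistent state is seen, 0 otherwise.
 */
int demo_check_data_in(const struct demo_cmd *cmd, size_t remaining,
		       char *buf, size_t len)
{
	if (remaining && !cmd->t_data_sg) {
		snprintf(buf, len,
			 "CDB: 0x%02x data_length: %u t_data_sg: %p t_data_nents: %u "
			 "se_cmd_flags: 0x%08x",
			 (unsigned int)cmd->cdb0, cmd->data_length,
			 cmd->t_data_sg, cmd->t_data_nents,
			 cmd->se_cmd_flags);
		return 1;
	}
	return 0;
}
```

Calling demo_check_data_in() with a non-zero remaining count and a NULL
t_data_sg fills buf with the dump line; any other combination returns 0
and leaves buf untouched.  Note that the two adjacent string literals
are concatenated by the compiler, so the first one needs a trailing
space to keep the t_data_nents value separate from the next label.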


> root@poc1:~# modinfo tcm_fc
> filename:       /lib/modules/3.11.0-18-generic/kernel/drivers/target/tcm_fc/tcm_fc.ko
> license:        GPL
> description:    FC TCM fabric driver 0.4
> srcversion:     68B468A9E0DB43CC9653984
> depends:        target_core_mod,libfc
> vermagic:       3.11.0-18-generic SMP mod_unload modversions
> parm:           debug_logging:a bit mask of logging levels (int)
> 
> On the 2 initiators, run fio to all the 6 hard drives on the target at
> the same time. The target crashes within a few seconds every time at
> the same RIP.
> 

So I don't see any tcm_fc-specific changes in the v3.11 code that would
cause such a bug, nor any v3.11.y bugfixes in this area that would
apply.

Also since the bug is easy to reproduce with multiple initiators, it
might be worthwhile to try to reproduce with v3.14.y as well.

Thanks,

--nab
