On Thu, 2010-11-11 at 14:57 -0800, Patil, Kiran wrote:
> Yes, transport_generic_handle_data, which is called from ft_recv_write_data, can do msleep_interruptible only if the transport is active.
> 
> FYI, this msleep was not introduced by my patch; it has been there.
> 
> Agree with both of Joe's suggestions (fcoe_rcv - always let it go to the processing thread, and TCM should not block the per-CPU receive thread). Will let Nick comment on that.
> 

Hey guys,

So the way to split out the interrupt-context setup of individual se_cmd descriptors for TCM_Loop (and other WIP HW FC target mode drivers) is to use the optional target_core_fabric_ops->new_cmd_map() callback for the pieces of se_cmd setup logic that currently cannot be done in interrupt context.  For TCM_Loop this is currently:

*) transport_generic_allocate_tasks() (access of lun, PR and ALUA specific
   locks currently using spin_lock() + spin_unlock())

*) transport_generic_map_mem_to_cmd() using GFP_KERNEL allocations

However, for this specific transport_generic_handle_data() case:

	/*
	 * Make sure that the transport has been disabled by
	 * transport_write_pending() before readding this struct se_cmd to the
	 * processing queue.  If it has not yet been reset to zero by the
	 * processing thread in transport_add_cmd_to_queue(), let other
	 * processes run.  If a signal was received, then we assume the
	 * connection is being failed/shutdown, so we return a failure.
	 */
	while (atomic_read(&T_TASK(cmd)->t_transport_active)) {
		msleep_interruptible(10);
		if (signal_pending(current))
			return -1;
	}

this loop is specific to the existing drivers/target/lio-target iSCSI code, which needs it for the traditional kernel sockets recv side iSCSI WRITE case.

Since we already have FCP write data ready for submission to backend devices at this point, I think we want something in the transport_generic_new_cmd() -> transport_generic_write_pending() code that does the immediate SCSI WRITE submission and skips the TFO->write_pending() callback / extra fabric API exchange/response..

Here is how TCM_Loop is currently doing that with SCSI WRITE data mapped from incoming ->queuecommand() cmd->table.sgl memory:

int tcm_loop_write_pending(struct se_cmd *se_cmd)
{
	/*
	 * Since Linux/SCSI has already sent down a struct scsi_cmnd
	 * sc->sc_data_direction of DMA_TO_DEVICE with struct scatterlist array
	 * memory, and memory has already been mapped to struct se_cmd->t_mem_list
	 * format with transport_generic_map_mem_to_cmd().
	 *
	 * We now tell TCM to add this WRITE CDB directly into the TCM storage
	 * object execution queue.
	 */
	transport_generic_process_write(se_cmd);
	return 0;
}

This will skip the transport_check_aborted_status() in transport_generic_handle_data(), and immediately add the T_TASK(cmd)->t_task_list for se_task execution down to se_subsystem_api->do_task() and out to backend subsystem code.

So just to reiterate the point with current v4.0 code: we currently cannot safely call transport_generic_allocate_tasks() or transport_generic_map_mem_to_cmd() from interrupt context, so you want to do these calls using the TFO->new_cmd_map() callback in the backend kernel thread process context..

So I think this means you want to call transport_generic_process_write() to immediately queue the WRITE from TFO->write_pending(), but I am not very certain after looking at ft_write_pending().  Joe, any thoughts here..?

Best,

--nab

> Thanks,
> -- Kiran P.
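To make the suggestion above concrete, here is a minimal sketch of what the proposed tcm_fc ->write_pending() path might look like. It assumes, as the TCM_Loop example above does, that the FCP write data has already been received and mapped to the se_cmd by the time ->write_pending() runs in the processing thread; ft_write_pending_immediate() is a hypothetical name, not the existing ft_write_pending():

/*
 * Hypothetical sketch only -- not the existing tcm_fc code.  Following the
 * TCM_Loop example above: if the write data is already available and mapped
 * when ->write_pending() is invoked from the processing thread, the WRITE
 * can be queued straight to the backend instead of issuing an XFER_RDY and
 * waiting for ft_recv_write_data() to call transport_generic_handle_data()
 * from interrupt context.
 */
static int ft_write_pending_immediate(struct se_cmd *se_cmd)
{
	/* Add this WRITE directly to the TCM storage object execution queue. */
	transport_generic_process_write(se_cmd);
	return 0;
}

Whether tcm_fc can actually take this path depends on whether the write data really is present at this point, which is exactly the open question raised about ft_write_pending() above.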
> 
> 
> -----Original Message-----
> From: devel-bounces@xxxxxxxxxxxxx [mailto:devel-bounces@xxxxxxxxxxxxx] On Behalf Of Joe Eykholt
> Sent: Thursday, November 11, 2010 11:52 AM
> To: Jansen, Frank
> Cc: devel@xxxxxxxxxxxxx
> Subject: Re: [Open-FCoE] transport_generic_handle_data - BUG: scheduling while atomic
> 
> 
> 
> On 11/11/10 11:41 AM, Jansen, Frank wrote:
> > Greetings!
> > 
> > I'm running 2.6.36 with Kiran Patil's patches from 10/28/10.
> > 
> > I have 4 logical volumes configured over fcoe:
> > 
> > [root@dut ~]# tcm_node --listhbas
> > \------> iblock_0
> > HBA Index: 1 plugin: iblock version: v4.0.0-rc5
> > \-------> r0_lun3
> > Status: ACTIVATED Execute/Left/Max Queue Depth: 0/32/32
> > SectorSize: 512 MaxSectors: 1024
> > iBlock device: dm-4 UDEV PATH: /dev/vg_R0_p1/lv_R0_p1_l3
> > Major: 253 Minor: 4 CLAIMED: IBLOCK
> > udev_path: /dev/vg_R0_p1/lv_R0_p1_l3
> > \-------> r0_lun2
> > Status: ACTIVATED Execute/Left/Max Queue Depth: 0/32/32
> > SectorSize: 512 MaxSectors: 1024
> > iBlock device: dm-3 UDEV PATH: /dev/vg_R0_p1/lv_R0_p1_l2
> > Major: 253 Minor: 3 CLAIMED: IBLOCK
> > udev_path: /dev/vg_R0_p1/lv_R0_p1_l2
> > \-------> r0_lun1
> > Status: ACTIVATED Execute/Left/Max Queue Depth: 0/32/32
> > SectorSize: 512 MaxSectors: 1024
> > iBlock device: dm-2 UDEV PATH: /dev/vg_R0_p1/lv_R0_p1_l1
> > Major: 253 Minor: 2 CLAIMED: IBLOCK
> > udev_path: /dev/vg_R0_p1/lv_R0_p1_l1
> > \-------> r0_lun0
> > Status: ACTIVATED Execute/Left/Max Queue Depth: 0/32/32
> > SectorSize: 512 MaxSectors: 1024
> > iBlock device: dm-1 UDEV PATH: /dev/vg_R0_p1/lv_R0_p1_l0
> > Major: 253 Minor: 1 CLAIMED: IBLOCK
> > udev_path: /dev/vg_R0_p1/lv_R0_p1_l0
> > 
> > When any significant I/O load is put on any of the devices, I receive
> > a flood of the following messages:
> > 
> >> Nov 11 13:46:09 dut kernel: BUG: scheduling while atomic: LIO_iblock/4439/0x00000101
> >> Nov 11 13:46:09 dut kernel: Modules linked in: fcoe libfcoe target_core_stgt target_core_pscsi target_core_file target_core_iblock ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 tcm_fc libfc scsi_transport_fc scsi_tgt target_core_mod configfs sunrpc ipv6 dm_mirror dm_region_hash dm_log kvm_intel kvm uinput ixgbe ioatdma iTCO_wdt ses enclosure i2c_i801 i2c_core iTCO_vendor_support mdio sg igb dca pcspkr evbug evdev ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi ata_generic mpt2sas scsi_transport_sas ata_piix raid_class dm_mod [last unloaded: speedstep_lib]
> >> Nov 11 13:46:09 dut kernel: Pid: 4439, comm: LIO_iblock Not tainted 2.6.36+ #1
> >> Nov 11 13:46:09 dut kernel: Call Trace:
> >> Nov 11 13:46:09 dut kernel:  <IRQ>  [<ffffffff8104fb96>] __schedule_bug+0x66/0x70
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff8149779c>] schedule+0xa2c/0xa60
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81497d73>] schedule_timeout+0x173/0x2e0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81071200>] ? process_timeout+0x0/0x10
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81497f3e>] schedule_timeout_interruptible+0x1e/0x20
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81072b39>] msleep_interruptible+0x39/0x50
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa033ebfa>] transport_generic_handle_data+0x2a/0x80 [target_core_mod]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa03c33ee>] ft_recv_write_data+0x1fe/0x2b0 [tcm_fc]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa03c13cb>] ft_recv_seq+0x8b/0xc0 [tcm_fc]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa03a0e1f>] fc_exch_recv+0x61f/0xe20 [libfc]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813c1123>] ? skb_copy_bits+0x63/0x2c0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813c15ea>] ? __pskb_pull_tail+0x26a/0x360
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa015b86d>] fcoe_recv_frame+0x18d/0x340 [fcoe]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813c13df>] ? __pskb_pull_tail+0x5f/0x360
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813c0404>] ? __netdev_alloc_skb+0x24/0x50
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa015e52a>] fcoe_rcv+0x2aa/0x44c [fcoe]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff8113c897>] ? __kmalloc_node_track_caller+0x67/0xe0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813c0404>] ? __netdev_alloc_skb+0x24/0x50
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813cd39a>] __netif_receive_skb+0x41a/0x5d0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81012699>] ? read_tsc+0x9/0x20
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813ceab8>] netif_receive_skb+0x58/0x80
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813cec20>] napi_skb_finish+0x50/0x70
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813cf1a5>] napi_gro_receive+0xc5/0xd0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa0207a1f>] ixgbe_clean_rx_irq+0x31f/0x840 [ixgbe]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa02083a6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813cf382>] net_rx_action+0x102/0x250
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81068af2>] __do_softirq+0xb2/0x240
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff8100c07c>] call_softirq+0x1c/0x30
> >> Nov 11 13:46:09 dut kernel:  <EOI>  [<ffffffff8100db25>] ? do_softirq+0x65/0xa0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81068664>] local_bh_enable+0x94/0xa0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff813cdfd3>] dev_queue_xmit+0x143/0x3b0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa015d96e>] fcoe_xmit+0x30e/0x520 [fcoe]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa03a2a13>] ? _fc_frame_alloc+0x33/0x90 [libfc]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa039f904>] fc_seq_send+0xb4/0x140 [libfc]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa03c1722>] ft_write_pending+0x112/0x160 [tcm_fc]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa0347800>] transport_generic_new_cmd+0x280/0x2b0 [target_core_mod]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa03479d4>] transport_processing_thread+0x1a4/0x7c0 [target_core_mod]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff810835d0>] ? autoremove_wake_function+0x0/0x40
> >> Nov 11 13:46:09 dut kernel:  [<ffffffffa0347830>] ? transport_processing_thread+0x0/0x7c0 [target_core_mod]
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81082f36>] kthread+0x96/0xa0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff8100bf84>] kernel_thread_helper+0x4/0x10
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff81082ea0>] ? kthread+0x0/0xa0
> >> Nov 11 13:46:09 dut kernel:  [<ffffffff8100bf80>] ? kernel_thread_helper+0x0/0x10
> > 
> > I started noticing these issues first when I ran I/O with larger
> > filesizes (appr. 25GB), but I'm thinking that might be a red herring.
> > I'll rebuild the kernel and tools to make sure nothing is out of sorts
> > and will report on any additional findings.
> > 
> > Thanks,
> > 
> > Frank
> 
> FCP data frames are coming in at the interrupt level, and TCM expects
> to be called in a thread or non-interrupt context, since
> transport_generic_handle_data() may sleep.
> 
> A quick workaround would be to change the fast path in fcoe_rcv() so that
> data always goes through the per-cpu receive threads.  That avoids part of the
> problem, but isn't anything like the right fix.  It doesn't seem good to
> let TCM block FCoE's per-cpu receive thread either.
> 
> Here's a quick change if you want to just work around the problem.
> I haven't tested it:
> 
> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> index feddb53..8f854cd 100644
> --- a/drivers/scsi/fcoe/fcoe.c
> +++ b/drivers/scsi/fcoe/fcoe.c
> @@ -1285,6 +1285,7 @@ int fcoe_rcv(struct sk_buff *skb, struct net_device *netdev,
>  	 * BLOCK softirq context.
>  	 */
>  	if (fh->fh_type == FC_TYPE_FCP &&
> +	    0 &&
>  	    cpu == smp_processor_id() &&
>  	    skb_queue_empty(&fps->fcoe_rx_list)) {
>  		spin_unlock_bh(&fps->fcoe_rx_list.lock);
> 
> ---
> 
> Cheers,
> Joe
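For readers following the workaround above, here is a small, self-contained sketch of the deferral pattern Joe is describing: instead of handling an FCP frame inline in softirq context, the receive hook only queues the skb onto a per-CPU list and wakes a dedicated kthread, which runs in process context and may therefore safely call into code that can sleep (such as transport_generic_handle_data()). All names here (fcoe_percpu_rx, fcoe_rx_defer, fcoe_rx_deferred_thread) are illustrative only and are not the actual drivers/scsi/fcoe/fcoe.c implementation.

/*
 * Illustrative sketch only -- not the actual fcoe.c code.  It shows the
 * "always defer to the per-CPU receive thread" behavior that the workaround
 * above forces by disabling the softirq fast path.
 */
#include <linux/skbuff.h>
#include <linux/kthread.h>
#include <linux/sched.h>

struct fcoe_percpu_rx {				/* hypothetical per-CPU context */
	struct sk_buff_head	rx_list;	/* frames waiting for the thread */
	struct task_struct	*thread;	/* per-CPU receive kthread */
};

/* Called from softirq context: must not sleep, so only queue and wake. */
static void fcoe_rx_defer(struct fcoe_percpu_rx *fps, struct sk_buff *skb)
{
	spin_lock(&fps->rx_list.lock);
	__skb_queue_tail(&fps->rx_list, skb);
	spin_unlock(&fps->rx_list.lock);
	wake_up_process(fps->thread);
}

/*
 * Runs in process context: sleeping calls such as
 * transport_generic_handle_data() are legal here.
 */
static int fcoe_rx_deferred_thread(void *arg)
{
	struct fcoe_percpu_rx *fps = arg;
	struct sk_buff *skb;

	while (!kthread_should_stop()) {
		set_current_state(TASK_INTERRUPTIBLE);
		while ((skb = skb_dequeue(&fps->rx_list)) != NULL) {
			__set_current_state(TASK_RUNNING);
			/* process the frame here; this may sleep */
		}
		schedule();
	}
	return 0;
}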