Re: BUG: unable to handle kernel NULL pointer dereference in transport_allocate_data_tasks+0x1af/0x2cc [target_core_mod]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 2011-08-07 at 19:16 +0100, Chris Boot wrote:
> Hi all,
> 
> I've been using LIO for a little while, and thought I'd give what's 
> currently in linux-2.6.git (master) a shot. Unfortunately, I get a BUG 
> within seconds of an initiator connecting to one of my targets - this 
> isn't immediate but happens a couple of seconds in. The kernel log is below:
> 

Greetings again Chris,

Thank you for your detailed bug reported with the v4.1.0-rc1 code.  My
comments are below.

> [  109.599886] BUG: unable to handle kernel NULL pointer dereference at 
> 000000000000000c
> [  109.617486] IP: [<ffffffffa048ca5a>] 
> transport_allocate_data_tasks+0x1af/0x2cc [target_core_mod]
> [  109.636400] PGD 0
> [  109.647538] Oops: 0000 [#1] SMP
> [  109.660125] CPU 1
> [  109.662253] Modules linked in: iscsi_target_mod target_core_pscsi 
> target_core_file target_core_iblock target_core_mod configfs 
> ip6table_mangle iptable_mangle xt_DSCP xt_owner tun ip6table_filter 
> ip6_tables iptable_filter ip_tables cpufreq_stats cpufreq_userspace 
> cpufreq_conservative ebtable_nat ebtables x_tables cpufreq_powersave 
> ipmi_watchdog microcode nfsd nfs lockd fscache auth_rpcgss nfs_acl 
> sunrpc bridge stp ext4 jbd2 crc16 ipmi_si ipmi_devintf ipmi_msghandler 
> acpi_cpufreq mperf coretemp fuse crc32c_intel aesni_intel cryptd 
> aes_x86_64 aes_generic dm_crypt kvm_intel kvm snd_pcm snd_timer snd 
> soundcore snd_page_alloc processor evdev i2c_i801 ftdi_sio i2c_core 
> joydev usbserial video pcspkr button ext3 jbd mbcache dm_mod sg sd_mod 
> crc_t10dif usbhid hid ahci libahci 3w_9xxx libata ehci_hcd thermal fan 
> scsi_mod thermal_sys usbcore e1000e [last unloaded: scsi_wait_scan]
> [  109.674461]
> [  109.674464] Pid: 4703, comm: iscsi_trx Not tainted 3.0.0+ #1 Intel 
> Corporation S1200BTL/S1200BTL
> [  109.674468] RIP: 0010:[<ffffffffa048ca5a>]  [<ffffffffa048ca5a>] 
> transport_allocate_data_tasks+0x1af/0x2cc [target_core_mod]
> [  109.674478] RSP: 0018:ffff880187ccdcd0  EFLAGS: 00010283
> [  109.674480] RAX: 0000000000000000 RBX: ffff8801877dd280 RCX: 
> 0000000000000000
> [  109.674483] RDX: 0000000000000000 RSI: 0000000000000020 RDI: 
> ffff88022dbd23e0
> [  109.674485] RBP: ffff880230327c00 R08: 0000000000000000 R09: 
> 0000000000000040
> [  109.674487] R10: ffff88018769cc00 R11: ffff8801877dd040 R12: 
> 0000000000000000
> [  109.674490] R13: 0000000000000002 R14: ffff880189884400 R15: 
> 0000000000000100
> [  109.674493] FS:  0000000000000000(0000) GS:ffff880236820000(0000) 
> knlGS:0000000000000000
> [  109.674496] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  109.674498] CR2: 000000000000000c CR3: 0000000001604000 CR4: 
> 00000000000426e0
> [  109.674500] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [  109.674503] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> [  109.674506] Process iscsi_trx (pid: 4703, threadinfo 
> ffff880187ccc000, task ffff8801edb571e0)
> [  109.674508] Stack:
> [  109.674509]  0000000000000000 ffff880200000000 ffff88022dbd23e0 
> 0000000000000100
> [  109.674513]  0000000100000200 000000000062ddf0 0000004000000020 
> ffff8801877dd40c
> [  109.674516]  0000004100000002 ffff8801877dd4a8 0000000000000000 
> ffff8801877dd280
> [  109.674520] Call Trace:
> [  109.674529]  [<ffffffffa048f7d8>] ? 
> transport_generic_new_cmd+0x190/0x2f6 [target_core_mod]
> [  109.674537]  [<ffffffffa04c413e>] ? iscsit_allocate_iovecs+0x24/0x4d 
> [iscsi_target_mod]
> [  109.674545]  [<ffffffffa048f9a5>] ? 
> transport_handle_cdb_direct+0x67/0x8e [target_core_mod]
> [  109.674552]  [<ffffffffa04c253b>] ? iscsit_sequence_cmd+0x5f/0xd8 
> [iscsi_target_mod]
> [  109.674558]  [<ffffffffa04c5b01>] ? 
> iscsi_target_rx_thread+0x7d2/0x1a71 [iscsi_target_mod]
> [  109.674564]  [<ffffffff8100d027>] ? load_TLS+0x7/0xa
> [  109.674567]  [<ffffffff8100d60f>] ? __switch_to+0xc1/0x20e
> [  109.674574]  [<ffffffffa04c532f>] ? 
> iscsit_thread_get_cpumask+0x88/0x88 [iscsi_target_mod]
> [  109.674578]  [<ffffffff8105e48d>] ? kthread+0x76/0x7e
> [  109.674582]  [<ffffffff81334a14>] ? kernel_thread_helper+0x4/0x10
> [  109.674585]  [<ffffffff8105e417>] ? kthread_worker_fn+0x139/0x139
> [  109.674588]  [<ffffffff81334a10>] ? gs_change+0x13/0x13
> [  109.674590] Code: 10 03 00 00 ff 90 c0 00 00 00 e9 1c 01 00 00 8b 75 
> 10 48 89 c7 e8 97 7f d1 e0 8b 55 40 4c 8b 45 08 c7 44 24 30 00 00 00 00 
> eb 45
> [  109.674603]  39 54 24 0c 77 47 4c 89 c7 4c 89 e6 b9 08 00 00 00 f3 a5 4c
> [  109.674610] RIP  [<ffffffffa048ca5a>] 
> transport_allocate_data_tasks+0x1af/0x2cc [target_core_mod]
> [  109.674618]  RSP <ffff880187ccdcd0>
> [  109.674620] CR2: 000000000000000c
> [  109.674662] ---[ end trace 263ff2d45aa35d71 ]---
> [  203.639102] iSCSI Login timeout on Network Portal 
> 2001:8b0:49:1:21e:67ff:fe14:69f4:3260
> 
> The crash happens here:
> 
> (gdb) list *transport_allocate_data_tasks+0x1af
> 0x11a5a is in transport_allocate_data_tasks 
> (drivers/target/target_core_transport.c:4142).
> 4137
> 4138                    task_size = task->task_size;
> 4139
> 4140                    /* Build new sgl, only up to task_size */
> 4141                    for_each_sg(task->task_sg, sg, 
> task->task_sg_nents, count) {
> 4142                            if (cmd_sg->length > task_size)
> 4143                                    break;
> 4144
> 4145                            *sg = *cmd_sg;
> 4146                            task_size -= cmd_sg->length;
> 
> Unfortunately I'm no kernel expert, so I can't debug any further than this.

It looks like we are running into some breakage with the recent v4.1 I/O
task generation code changes and a certain SCSI CDB size combination
being handled by iscsi-target (grover CC'ed).

As this new code is currently lacking debug output, the easiet way to
debug this with your current build would be making a wireshark packet
capture from the initiator side so we can determine which CDB is
actually trigger the bug here..

Proper pr_debug() out messages need to be added for these codepaths to
handle this types of cases w/o a recompile, but that is going to take a
little more time.  ;)

Also, the output from the following quick and dirty debug patch would be
helpful to pinpoint the issue as well.

Thanks for your bug report Chris!

--nab

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index e9311d0..38e4435 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -4102,6 +4102,9 @@ static int transport_allocate_data_tasks(
        WARN_ON(cmd->data_length % sector_size);
        sectors = DIV_ROUND_UP(cmd->data_length, sector_size);
        task_count = DIV_ROUND_UP_SECTOR_T(sectors, dev_max_sectors);
+
+       printk("cmd->data_length: %u, sectors: %llu, task_count: %d\n",
+                       cmd->data_length, sectors, task_count);
        
        cmd_sg = sgl;
        for (i = 0; i < task_count; i++) {
@@ -4115,6 +4118,9 @@ static int transport_allocate_data_tasks(
                task->task_lba = lba;
                task->task_sectors = min(sectors, dev_max_sectors);
                task->task_size = task->task_sectors * sector_size;
+               
+               printk("task[%d]: task_lba: %llu task_sectors: %u task_size: %u\n",
+                       i, task->task_lba, task->task_sectors, task->task_size);
 
                cdb = dev->transport->get_cdb(task);
                BUG_ON(!cdb);
@@ -4149,16 +4155,31 @@ static int transport_allocate_data_tasks(
 
                sg_init_table(task->task_sg, task->task_sg_nents);
 
+               printk("Setup task->task_sg: %p task->task_sg_nents: %d\n",
+                       task->task_sg, task->task_sg_nents);
+
                task_size = task->task_size;
 
                /* Build new sgl, only up to task_size */
                for_each_sg(task->task_sg, sg, task->task_sg_nents, count) {
+                       printk("sg[%d] Remaining task_size: %d cmd_sg: %p cmd_sg->length: %d\n",
+                               count, task_size, cmd_sg, cmd_sg->length);
+
                        if (cmd_sg->length > task_size)
                                break;
 
                        *sg = *cmd_sg;
                        task_size -= cmd_sg->length;
+                       if (task_size > 0x00ffffff) {
+                               printk("Bogus task_size: %d\n", task_size);
+                               return -EINVAL;
+                       }
+
                        cmd_sg = sg_next(cmd_sg);
+                       if (!cmd_sg || !cmd_sg->length) {
+                               printk("Bogus SGL at %p\n", cmd_sg);
+                               return -EINVAL;
+                       }
                }
 
                lba += task->task_sectors;


--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux