https://bugzilla.kernel.org/show_bug.cgi?id=199435

Bug ID: 199435
Summary: HPSA + P420i resetting logical Direct-Access never complete
Product: IO/Storage
Version: 2.5
Kernel Version: 4.11.0-14-generic
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: SCSI
Assignee: linux-scsi@xxxxxxxxxxxxxxx
Reporter: anthonyhaussmann@xxxxxxxxx
Regression: No

I'm using kernel 4.11.0-14-generic with the latest hpsa driver, compiled from the most recent commit in Torvalds' GitHub tree:
https://github.com/torvalds/linux/commit/8b834bff1b73dce46f4e9f5e84af6f73fed8b0ef#diff-7a84fb366ebc08b575a832f0aeee3434

I'm using a Smart Array P420i, firmware version 8.32.

When a logical drive reset is triggered, it never completes and the server starts to experience heavy load (it can rise to 3000).

After the reset, some tasks begin to time out, but I think that is just an effect of the reset (cmaeventd is the process that checks the controller status):

Apr 18 01:28:53 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1
Apr 18 01:29:16 kernel: INFO: task cmaeventd:3397 blocked for more than 120 seconds.
Apr 18 01:29:16 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu
Apr 18 01:29:16 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 18 01:29:16 kernel: cmaeventd D 0 3397 1 0x00000000
Apr 18 01:29:16 kernel: Call Trace:
Apr 18 01:29:16 kernel: __schedule+0x3b9/0x8f0
Apr 18 01:29:16 kernel: schedule+0x36/0x80
Apr 18 01:29:16 kernel: scsi_block_when_processing_errors+0xd5/0x110
Apr 18 01:29:16 kernel: ? wake_atomic_t_function+0x60/0x60
Apr 18 01:29:16 kernel: sg_open+0x14a/0x5c0
Apr 18 01:29:16 kernel: ? lookup_fast+0xd8/0x3b0
Apr 18 01:29:16 kernel: ? refcount_inc+0x9/0x40
Apr 18 01:29:16 kernel: chrdev_open+0xbf/0x1b0
Apr 18 01:29:16 kernel: do_dentry_open+0x208/0x310
Apr 18 01:29:16 kernel: ? cdev_put+0x30/0x30
Apr 18 01:29:16 kernel: vfs_open+0x4e/0x80
Apr 18 01:29:16 kernel: path_openat+0x2ac/0x1450
Apr 18 01:29:16 kernel: do_filp_open+0x99/0x110
Apr 18 01:29:16 kernel: ? __check_object_size+0x108/0x19e
Apr 18 01:29:16 kernel: ? __alloc_fd+0x46/0x170
Apr 18 01:29:16 kernel: do_sys_open+0x12d/0x280
Apr 18 01:29:16 kernel: ? do_sys_open+0x12d/0x280
Apr 18 01:29:16 kernel: ? __put_cred+0x3d/0x50
Apr 18 01:29:16 kernel: ? SyS_access+0x1e8/0x230
Apr 18 01:29:16 kernel: SyS_open+0x1e/0x20
Apr 18 01:29:16 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad
Apr 18 01:29:16 kernel: RIP: 0033:0x7f413c901be0
Apr 18 01:29:16 kernel: RSP: 002b:00007ffc0c1cd5b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
Apr 18 01:29:16 kernel: RAX: ffffffffffffffda RBX: 00000000025f7a40 RCX: 00007f413c901be0
Apr 18 01:29:16 kernel: RDX: 0000000000000008 RSI: 0000000000000002 RDI: 00007ffc0c1cd5f0
Apr 18 01:29:16 kernel: RBP: 0000000002563b40 R08: 0000000000000001 R09: 0000000000000000
Apr 18 01:29:16 kernel: R10: 00007f413c8ea760 R11: 0000000000000246 R12: 00007ffc0c1cd7b0
Apr 18 01:29:16 kernel: R13: 0000000000000001 R14: 00007ffc0c1cd700 R15: 00007ffc0c1cd830
Apr 18 01:29:16 kernel: INFO: task cmaidad:3442 blocked for more than 120 seconds.
Apr 18 01:29:16 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu
Apr 18 01:29:16 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 18 01:29:16 kernel: cmaidad D 0 3442 1 0x00000000
Apr 18 01:29:16 kernel: Call Trace:
Apr 18 01:29:16 kernel: __schedule+0x3b9/0x8f0
Apr 18 01:29:16 kernel: schedule+0x36/0x80
Apr 18 01:29:16 kernel: scsi_block_when_processing_errors+0xd5/0x110
Apr 18 01:29:16 kernel: ? wake_atomic_t_function+0x60/0x60
Apr 18 01:29:16 kernel: sg_open+0x14a/0x5c0
Apr 18 01:29:16 kernel: ? lookup_fast+0xd8/0x3b0
Apr 18 01:29:16 kernel: ? refcount_inc+0x9/0x40
Apr 18 01:29:16 kernel: chrdev_open+0xbf/0x1b0
Apr 18 01:29:16 kernel: do_dentry_open+0x208/0x310
Apr 18 01:29:16 kernel: ? cdev_put+0x30/0x30
Apr 18 01:29:16 kernel: vfs_open+0x4e/0x80
Apr 18 01:29:16 kernel: path_openat+0x2ac/0x1450
Apr 18 01:29:16 kernel: do_filp_open+0x99/0x110
Apr 18 01:29:16 kernel: ? ipcperms+0x94/0x100
Apr 18 01:29:16 kernel: ? __check_object_size+0x108/0x19e
Apr 18 01:29:16 kernel: ? __alloc_fd+0x46/0x170
Apr 18 01:29:16 kernel: do_sys_open+0x12d/0x280
Apr 18 01:29:16 kernel: ? do_sys_open+0x12d/0x280
Apr 18 01:29:16 kernel: ? __put_cred+0x3d/0x50
Apr 18 01:29:16 kernel: ? SyS_access+0x1e8/0x230
Apr 18 01:29:16 kernel: SyS_open+0x1e/0x20
Apr 18 01:29:16 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad
Apr 18 01:29:16 kernel: RIP: 0033:0x7ff5af4cdbe0
Apr 18 01:29:16 kernel: RSP: 002b:00007fff8eac8818 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
Apr 18 01:29:16 kernel: RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007ff5af4cdbe0
Apr 18 01:29:16 kernel: RDX: 0000000000000008 RSI: 0000000000000002 RDI: 00007fff8eac8850
Apr 18 01:29:16 kernel: RBP: 0000000002372870 R08: 0000000000000001 R09: 00007ff5af4b77b8
Apr 18 01:29:16 kernel: R10: 00007ff5af4b6760 R11: 0000000000000246 R12: 0000000002372878
Apr 18 01:29:16 kernel: R13: 0000000000000005 R14: 00007ff5b00018c0 R15: 0000000000000000
Apr 18 01:29:16 kernel: INFO: task jbd2/sdam-8:9965 blocked for more than 120 seconds.
Apr 18 01:29:16 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu
Apr 18 01:29:16 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 18 01:29:16 kernel: jbd2/sdam-8 D 0 9965 2 0x00000000
Apr 18 01:29:16 kernel: Call Trace:
Apr 18 01:29:16 kernel: __schedule+0x3b9/0x8f0
Apr 18 01:29:16 kernel: schedule+0x36/0x80
Apr 18 01:29:16 kernel: jbd2_journal_commit_transaction+0x241/0x1830
Apr 18 01:29:16 kernel: ? update_load_avg+0x84/0x560
Apr 18 01:29:16 kernel: ? update_load_avg+0x84/0x560
Apr 18 01:29:16 kernel: ? dequeue_entity+0xed/0x4c0
Apr 18 01:29:16 kernel: ? wake_atomic_t_function+0x60/0x60
Apr 18 01:29:16 kernel: ? lock_timer_base+0x7d/0xa0
Apr 18 01:29:16 kernel: kjournald2+0xca/0x250
Apr 18 01:29:16 kernel: ? kjournald2+0xca/0x250
Apr 18 01:29:16 kernel: ? wake_atomic_t_function+0x60/0x60
Apr 18 01:29:16 kernel: kthread+0x109/0x140
Apr 18 01:29:16 kernel: ? commit_timeout+0x10/0x10
Apr 18 01:29:16 kernel: ? kthread_create_on_node+0x70/0x70
Apr 18 01:29:16 kernel: ret_from_fork+0x25/0x30

The only way to get back to normal is to reboot the server.

Hope this helps somebody. If there is any more info I can provide, just ask what would be useful.