Yves-Alexis Perez wrote:
> since kernel 4.11 (sorry it took so long to report) I have a box
> failing to boot with a NULL pointer dereference (the box is stuck
> there afterwards).

I get the same result on a Quanta server with several 4.13 and 4.14
kernels (from the Ubuntu "mainline" and Xenial hwe-edge PPAs).

I guess this is the same problem that Stefan Priebe reported under
"isci regression in 4.11.0-rc2 by scsi: libsas: allow async aborts" on
8 November 2017 [1]. That report didn't elicit any response here.

> The bug has also been reported to the Debian BTS ([2]) and a
> suggestion to revert 90965761 has been made. I can confirm it fix the
> boot issue.

The Debian people have implemented the suggestion to revert 90965761 as
of their 4.14.12-1 kernel package [2].

> I don't have the complete stack trace at hand but there's an example
> in the Debian bug.

Here's a stack trace from my server. It was copied and pasted from a
serial console (IPMI SOL); I hope it's complete.

[    9.184043] BUG: unable to handle kernel NULL pointer dereference at (null)
[    9.184055] IP: isci_task_abort_task+0x43/0x400 [isci]
[    9.184056] PGD 0
[    9.184056] P4D 0
[    9.184057]
[    9.184058] Oops: 0000 [#1] SMP
[    9.184060] Modules linked in: aesni_intel(+) aes_x86_64 crypto_simd glue_helper cryptd mei_me intel_cstate intel_rapl_perf mei shpchp lpc_ich ipmi_si(+) mac_hid kvm_intel kvm irqbypass ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_devintf ipmi_msghandler autofs4 btrfs xor raid6_pq ast ttm drm_kms_helper ixgbe igb syscopyarea isci sysfillrect i2c_algo_bit dca sysimgblt libsas fb_sys_fops ptp mdio drm scsi_transport_sas pps_core wmi
[    9.184084] CPU: 18 PID: 434 Comm: kworker/u48:1 Not tainted 4.13.0-21-generic #24~16.04.1-Ubuntu
[    9.184084] Hardware name: Quanta S210-X12RS V2/S210-X12RS V2, BIOS S2RQ4A08 08/12/2013
[    9.184090] Workqueue: scsi_tmf_0 scmd_eh_abort_handler
[    9.184091] task: ffff96507bb05d00 task.stack: ffffa2de87bb4000
[    9.184095] RIP: 0010:isci_task_abort_task+0x43/0x400 [isci]
[    9.184095] RSP: 0018:ffffa2de87bb7c88 EFLAGS: 00010246
[    9.184096] RAX: 0000000000000000 RBX: ffff9650782f11a8 RCX: 0000000000000000
[    9.184097] RDX: 0000000000000000 RSI: ffff9650782f11a8 RDI: 0000000000000000
[    9.184097] RBP: ffffa2de87bb7e28 R08: 0000000000000000 R09: 0000000000000001
[    9.184098] R10: 000000000000b8cb R11: 00000000000002f3 R12: ffff9650782f1148
[    9.184098] R13: ffff9650758cb800 R14: 0000000000000008 R15: 0000000000000000
[    9.184099] FS: 0000000000000000(0000) GS:ffff9660bf380000(0000) knlGS:0000000000000000
[    9.184100] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.184100] CR2: 0000000000000000 CR3: 000000004b009000 CR4: 00000000001406e0
[    9.184101] Call Trace:
[    9.184107] ? cpumask_next_and+0x31/0x50
[    9.184110] ? load_balance+0x1b5/0x9c0
[    9.184114] ? sched_clock+0x9/0x10
[    9.184116] ? sched_clock+0x9/0x10
[    9.184117] ? sched_clock+0x9/0x10
[    9.184120] ? sched_clock_cpu+0x11/0xb0
[    9.184121] ? pick_next_task_fair+0x3c7/0x560
[    9.184123] ? __switch_to+0x211/0x510
[    9.184125] ? put_prev_entity+0x27/0x100
[    9.184129] sas_eh_abort_handler+0x30/0x50 [libsas]
[    9.184131] scmd_eh_abort_handler+0x74/0x230
[    9.184135] process_one_work+0x156/0x410
[    9.184136] worker_thread+0x4b/0x460
[    9.184138] kthread+0x109/0x140
[    9.184139] ? process_one_work+0x410/0x410
[    9.184140] ? kthread_create_on_node+0x70/0x70
[    9.184143] ret_from_fork+0x25/0x30
[    9.184144] Code: 08 48 81 ec 78 01 00 00 c7 85 78 fe ff ff 00 00 00 00 c7 85 80 fe ff ff 00 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 <48> 8b 07 48 8b 40 30 48 8b 80 90 02 00 00 4c 8b a0 28 01 00 00
[    9.184160] RIP: isci_task_abort_task+0x43/0x400 [isci] RSP: ffffa2de87bb7c88
[    9.184161] CR2: 0000000000000000
[    9.184162] ---[ end trace bf9920b58fca631f ]---

> The machine is a Dell Precision T5600 with the following SATA
> controllers:
> 00:1f.2 SATA controller: Intel Corporation C600/X79 series chipset 6-Port SATA
> AHCI Controller (rev 05)
> 05:00.0 Serial Attached SCSI controller: Intel Corporation C602 chipset 4-Port
> SATA Storage Control Unit (rev 05)

Mine is a Quanta S210-X12RS server with only one SATA controller:

08:00.0 Serial Attached SCSI controller: Intel Corporation C602 chipset 4-Port SATA Storage Control Unit (rev 05)

Connected to that SATA controller are two Samsung 850 EVO 250GB SSDs
and one 3TB WD Red disk.

> If you need more information or need me to test something, please ask.

Likewise.

Best regards,

--
Simon.

[1] https://marc.info/?l=linux-scsi&m=151013394701914
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882414