On Thu, Aug 19, 2021 at 9:53 PM Christoph Hellwig <hch@xxxxxx> wrote:
>
> On Thu, Aug 19, 2021 at 03:10:37PM +0200, Sven Schnelle wrote:
> > Christoph Hellwig <hch@xxxxxx> writes:
> >
> > > On Thu, Aug 19, 2021 at 11:03:42AM +0200, Sven Schnelle wrote:
> > >> I'm seeing a similar crash in our CI:
> > >
> > > This series:
> > >
> > > https://lore.kernel.org/linux-block/20210816131910.615153-1-hch@xxxxxx/T/#t
> > >
> > > should fix it. Can you give it a spin?
> >
> > I tested it without your patchset and it crashed around every second
> > try. With that patchset, I wasn't able to reproduce it.
>
> Can you send a Tested-by: for the last patch which should fix this?
>

Hi Christoph

I also hit a similar issue with blktests. I tried to apply the patchset
but had no luck applying it. Any suggestions on how to fix this?

[ 2464.154898] run blktests nvme/012 at 2021-08-20 21:20:29
[ 2464.192252] loop0: detected capacity change from 0 to 2097152
[ 2464.309275] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 2464.396464] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:a43453b4c0df4cb7bd2303f547ca0f22.
[ 2464.410162] nvme nvme0: creating 128 I/O queues.
[ 2464.425839] nvme nvme0: new ctrl: "blktests-subsystem-1"
[ 2465.483434] XFS (nvme0n1): Mounting V5 Filesystem
[ 2465.493142] XFS (nvme0n1): Ending clean mount
[ 2465.498278] xfs filesystem being mounted at /mnt/blktests supports timestamps until 2038 (0x7fffffff)
[ 2488.544383] XFS (nvme0n1): Unmounting Filesystem
[ 2488.559652] nvme nvme0: Removing ctrl: NQN "blktests-subsystem-1"
[ 2488.625086] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000308
[ 2488.633871] Mem abort info:
[ 2488.636655] ESR = 0x96000004
[ 2488.639698] EC = 0x25: DABT (current EL), IL = 32 bits
[ 2488.645000] SET = 0, FnV = 0
[ 2488.648044] EA = 0, S1PTW = 0
[ 2488.651173] FSC = 0x04: level 0 translation fault
[ 2488.656039] Data abort info:
[ 2488.658908] ISV = 0, ISS = 0x00000004
[ 2488.662732] CM = 0, WnR = 0
[ 2488.665689] user pgtable: 4k pages, 48-bit VAs, pgdp=00000008fd3a0000
[ 2488.672119] [0000000000000308] pgd=0000000000000000, p4d=0000000000000000
[ 2488.678903] Internal error: Oops: 96000004 [#1] SMP
[ 2488.683770] Modules linked in: nvme_loop nvme_fabrics nvmet nvme_core loop dm_log_writes dm_flakey rfkill mlx5_ib ib_uverbs ib_core sunrpc coresight_etm4x i2c_smbus coresight_tpiu coresight_replicator coresight_tmc joydev mlx5_core mlxfw psample tls acpi_ipmi ipmi_ssif ipmi_devintf ipmi_msghandler coresight_funnel coresight thunderx2_pmu vfat fat fuse zram ip_tables xfs crct10dif_ce ast ghash_ce i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm gpio_xlp i2c_xlp9xx uas usb_storage aes_neon_bs [last unloaded: nvme_core]
[ 2488.735491] CPU: 41 PID: 0 Comm: swapper/41 Not tainted 5.14.0-rc5 #1
[ 2488.751647] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--)
[ 2488.757641] pc : latency_exceeded+0x30/0x304
[ 2488.761905] lr : wb_timer_fn+0x48/0x1fc
[ 2488.765730] sp : ffff800012b83d00
[ 2488.769031] x29: ffff800012b83d00 x28: ffff800011dc7000 x27: ffff800012b83e80
[ 2488.776156] x26: ffff800011886000 x25: 00000000000000e0 x24: 0000000000000028
[ 2488.783280] x23: ffffffffffffffff x22: ffff0008097ac900 x21: ffff0008097ade80
[ 2488.790404] x20: 0000000000000000 x19: ffff00080adfb300 x18: 0000000000000000
[ 2488.797528] x17: ffff800f4ac9a000 x16: ffff800012b84000 x15: 0000000000004000
[ 2488.804652] x14: 0000000000000000 x13: 0000000000000038 x12: 0000000000000040
[ 2488.811776] x11: 0000000000036e09 x10: ffff0008097ade80 x9 : ffff80001073074c
[ 2488.818901] x8 : fffffbffec7ec0b0 x7 : 000000000000003f x6 : 0000000000000000
[ 2488.826024] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 2488.833148] x2 : 0000000000000000 x1 : ffff0008097ade80 x0 : 0000000000000000
[ 2488.840273] Call trace:
[ 2488.842707] latency_exceeded+0x30/0x304
[ 2488.846618] wb_timer_fn+0x48/0x1fc
[ 2488.850095] blk_stat_timer_fn+0x170/0x190
[ 2488.854183] call_timer_fn+0x3c/0x17c
[ 2488.857835] __run_timers.part.0+0x290/0x330
[ 2488.862092] run_timer_softirq+0x48/0x80
[ 2488.866002] __do_softirq+0x128/0x380
[ 2488.869653] __irq_exit_rcu+0x154/0x160
[ 2488.873482] irq_exit+0x1c/0x30
[ 2488.876612] handle_domain_irq+0x70/0x9c
[ 2488.880525] gic_handle_irq+0x58/0xd8
[ 2488.884174] call_on_irq_stack+0x2c/0x38
[ 2488.888086] do_interrupt_handler+0x5c/0x70
[ 2488.892257] el1_interrupt+0x30/0x50
[ 2488.895824] el1h_64_irq_handler+0x18/0x24
[ 2488.899908] el1h_64_irq+0x7c/0x80
[ 2488.903297] arch_cpu_idle+0x18/0x2c
[ 2488.906861] default_idle_call+0x4c/0x160
[ 2488.910860] cpuidle_idle_call+0x14c/0x1a0
[ 2488.914947] do_idle+0xbc/0x110
[ 2488.918077] cpu_startup_entry+0x30/0x8c
[ 2488.921988] secondary_start_kernel+0xec/0x110
[ 2488.926422] __secondary_switched+0x94/0x98
[ 2488.930596] Code: aa0103f5 f9403000 f9401674 f9404800 (f9418400)
[ 2488.936789] ---[ end trace 8d092c5fdd268b3c ]---
[ 2488.941394] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 2488.948283] SMP: stopping secondary CPUs
[ 2488.952240] Kernel Offset: disabled
[ 2488.955715] CPU features: 0x00600051,a3200840
[ 2488.960059] Memory Limit: none
[ 2488.963125] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

--
Best Regards,
Yi Zhang