EMERG: SCSI driver issue with Fedora 40 - disk left bus - bus reset

Hi,

Short:

There seems to be an issue with the SCSI driver under high I/O load in Fedora 40, but it is not the kernel alone.

Long:

https://forum.proxmox.com/threads/fedora-40-vm-losses-disks-scsi-target2-0-0-target-reset-operation-started.157452/

Today, or rather six hours ago, two Proxmox VMs (KVM/QEMU) with emulated SCSI drives were upgraded from F39 to F40.
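
For context, the "emulated SCSI drives" are disks attached through QEMU's emulated LSI controller, which the guest drives with the sym53c8xx module ("sym0" in the logs below). A rough, hedged sketch of what such a Proxmox VM config looks like (VM ID, storage name and sizes are placeholders, not the real values):

    # excerpt from /etc/pve/qemu-server/100.conf (illustrative only)
    scsihw: lsi
    scsi0: local-nvme:vm-100-disk-0,size=32G
    scsi1: local-nvme:vm-100-disk-1,size=64G
    boot: order=scsi0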

Both VMs produced a lot of these messages whenever something I/O-intensive was done; it took 4-10 minutes to recover each time:

[Nov14 23:49] sd 2:0:0:0: [sda] tag#76 ABORT operation started
[  +4,898867] sd 2:0:0:0: ABORT operation timed-out.
[  +0,000008] sd 2:0:1:0: [sdb] tag#383 ABORT operation started
[  +4,914194] sd 2:0:1:0: ABORT operation timed-out.
[  +0,000014] sd 2:0:1:0: [sdb] tag#322 ABORT operation started
[  +4,916147] sd 2:0:1:0: ABORT operation timed-out.
[  +0,000007] sd 2:0:1:0: [sdb] tag#321 ABORT operation started
[  +4,914256] sd 2:0:1:0: ABORT operation timed-out.
[  +0,000009] sd 2:0:1:0: [sdb] tag#320 ABORT operation started
[  +4,915142] sd 2:0:1:0: ABORT operation timed-out.
[  +0,000006] sd 2:0:1:0: [sdb] tag#323 ABORT operation started
[  +4,915177] sd 2:0:1:0: ABORT operation timed-out.
[  +0,000007] sd 2:0:1:0: [sdb] tag#324 ABORT operation started
[  +4,915221] sd 2:0:1:0: ABORT operation timed-out.
[  +0,000007] sd 2:0:1:0: [sdb] tag#325 ABORT operation started
[Nov14 23:50] sd 2:0:1:0: ABORT operation timed-out.
[  +0,009594] scsi target2:0:0: TARGET RESET operation started
[  +4,905550] scsi target2:0:0: TARGET RESET operation timed-out.
[  +0,000005] scsi target2:0:1: TARGET RESET operation started
[ +34,407328] scsi target2:0:1: TARGET RESET operation timed-out.
[  +9,829432] sd 2:0:0:0: [sda] tag#76 ABORT operation started
[  +0,002336] sym0: SCSI BUS reset detected.
[  +0,002062] sd 2:0:0:0: ABORT operation complete.
[  +0,005062] sym0: SCSI BUS has been reset.
[  +3,193215] sd 2:0:1:0: Power-on or device reset occurred
[  +0,000057] sd 2:0:0:0: [sda] tag#76 BUS RESET operation started
[  +0,002319] sd 2:0:0:0: BUS RESET operation complete.
[  +0,000008] sym0: SCSI BUS reset detected.
[  +0,006945] sym0: SCSI BUS has been reset.
...
[ +10,058980] sd 2:0:0:0: Power-on or device reset occurred
[  +0,016222] sd 2:0:1:0: [sdb] tag#393 timing out command, waited 180s
[  +0,000005] sd 2:0:1:0: [sdb] tag#393 FAILED Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK cmd_age=279s
[  +0,000002] sd 2:0:1:0: [sdb] tag#393 CDB: Write(10) 2a 00 00 00 a6 40 00 02 00 00
[  +0,000001] I/O error, dev sdb, sector 42560 op 0x1:(WRITE) flags 0x8800 phys_seg 64 prio class 2
[  +0,000003] I/O error, dev sdb, sector 42560 op 0x1:(WRITE) flags 0x8800 phys_seg 64 prio class 2
[  +0,000002] sd 2:0:1:0: [sdb] tag#394 timing out command, waited 180s
[  +0,000002] sd 2:0:1:0: [sdb] tag#394 FAILED Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK cmd_age=279s
[  +0,000001] sd 2:0:1:0: [sdb] tag#394 CDB: Write(10) 2a 00 00 00 a3 40 00 03 00 00
[  +0,000000] I/O error, dev sdb, sector 41792 op 0x1:(WRITE) flags 0xc800 phys_seg 96 prio class 2
[  +0,000001] I/O error, dev sdb, sector 41792 op 0x1:(WRITE) flags 0xc800 phys_seg 96 prio class 2
[  +0,000002] sd 2:0:1:0: [sdb] tag#395 timing out command, waited 180s
[  +0,000001] sd 2:0:1:0: [sdb] tag#395 FAILED Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK cmd_age=279s
[  +0,000001] sd 2:0:1:0: [sdb] tag#395 CDB: Write(10) 2a 00 00 00 a0 40 00 03 00 00
[  +0,000000] I/O error, dev sdb, sector 41024 op 0x1:(WRITE) flags 0xc800 phys_seg 96 prio class 2
[  +0,000001] I/O error, dev sdb, sector 41024 op 0x1:(WRITE) flags 0xc800 phys_seg 96 prio class 2
[  +0,000013] sd 2:0:1:0: [sdb] tag#396 timing out command, waited 180s
[  +0,000002] sd 2:0:1:0: [sdb] tag#396 FAILED Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK cmd_age=279s
[  +0,000000] sd 2:0:1:0: [sdb] tag#396 CDB: Write(10) 2a 00 00 00 9e 40 00 02 00 00
[  +0,000001] I/O error, dev sdb, sector 40512 op 0x1:(WRITE) flags 0x8800 phys_seg 64 prio class 2
[  +0,000000] I/O error, dev sdb, sector 40512 op 0x1:(WRITE) flags 0x8800 phys_seg 64 prio class 2
[  +0,000001] sd 2:0:1:0: [sdb] tag#397 timing out command, waited 180s
[  +0,000002] sd 2:0:1:0: [sdb] tag#397 FAILED Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK cmd_age=279s
[  +0,000000] sd 2:0:1:0: [sdb] tag#397 CDB: Write(10) 2a 00 00 00 9b 40 00 03 00 00
[  +0,000001] I/O error, dev sdb, sector 39744 op 0x1:(WRITE) flags 0xc800 phys_seg 96 prio class 2
[  +0,000001] I/O error, dev sdb, sector 39744 op 0x1:(WRITE) flags 0xc800 phys_seg 96 prio class 2
[  +0,000001] sd 2:0:1:0: [sdb] tag#398 timing out command, waited 180s
[  +0,000001] sd 2:0:1:0: [sdb] tag#398 FAILED Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK cmd_age=279s
[  +0,000000] sd 2:0:1:0: [sdb] tag#398 CDB: Write(10) 2a 00 00 00 98 40 00 03 00 00
[  +0,005144] sd 2:0:1:0: Power-on or device reset occurred

and it finally ended in a KERNEL CRASH (this is the old fc39 kernel crashing; the newest F40 kernel presumably did the same):

[  +0,000001]  static_key_enable_cpuslocked+0x63/0x90
[  +0,000004]  static_key_enable+0x1a/0x20
[  +0,000001]  process_one_work+0x173/0x340
[  +0,000003]  worker_thread+0x27b/0x3a0
[  +0,000002]  ? __pfx_worker_thread+0x10/0x10
[  +0,000001]  kthread+0xe5/0x120
[  +0,000002]  ? __pfx_kthread+0x10/0x10
[  +0,000001]  ret_from_fork+0x31/0x50
[  +0,000002]  ? __pfx_kthread+0x10/0x10
[  +0,000001]  ret_from_fork_asm+0x1b/0x30
[  +0,000003]  </TASK>
[ +26,879827] watchdog: BUG: soft lockup - CPU#1 stuck for 118s! [kworker/1:3:120797]
[  +0,000015] Modules linked in: xt_owner ipt_REJECT nf_reject_ipv4 xt_connlimit nf_conncount nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables xt_multiport iptable_filter ip_tables cfg80211 rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency_common nfit libnvdimm kvm_intel kvm irqbypass i2c_piix4 rapl virtio_balloon joydev fuse loop nfnetlink crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni bochs polyval_generic ghash_clmulni_intel sha512_ssse3 drm_vram_helper sha256_ssse3 drm_ttm_helper virtio_net ttm sym53c8xx sha1_ssse3 net_failover scsi_transport_spi failover serio_raw ata_generic pata_acpi qemu_fw_cfg
[  +0,000024] CPU: 1 PID: 120797 Comm: kworker/1:3 Tainted: G             L     6.8.11-200.fc39.x86_64 #1
[  +0,000002] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[  +0,000001] Workqueue: events netstamp_clear
[  +0,000006] RIP: 0010:smp_call_function_many_cond+0x12e/0x560
[  +0,000004] Code: 73 35 48 63 e8 49 8b 1f 48 81 fd 00 20 00 00 0f 83 f6 03 00 00 48 03 1c ed a0 4c c0 bb 8b 53 08 48 89 de 83 e2 01 74 0a f3 90 <8b> 4e 08 83 e1 01 75 f6 83 c0 01 eb b1 48 83 c4 40 5b 5d 41 5c 41
[  +0,000001] RSP: 0018:ff36e8b984593d60 EFLAGS: 00000202
[  +0,000001] RAX: 0000000000000000 RBX: ff26fb3f2483ce00 RCX: 0000000000000001
[  +0,000001] RDX: 0000000000000001 RSI: ff26fb3f2483ce00 RDI: ff26fb3940156b98
[  +0,000000] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000006
[  +0,000001] R10: 0000000000000002 R11: 0000000000000000 R12: ff26fb3f248b5ac0
[  +0,000000] R13: 0000000000000001 R14: 0000000000000001 R15: ff26fb3f248b5ac0
[  +0,000001] FS:  0000000000000000(0000) GS:ff26fb3f24880000(0000) knlGS:0000000000000000
[  +0,000001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0,000001] CR2: 000014bce6b69d60 CR3: 000000010d428003 CR4: 0000000000771ef0
[  +0,000002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0,000001] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[  +0,000000] PKRU: 55555554
[  +0,000001] Call Trace:
[  +0,000002]  <IRQ>
[  +0,000002]  ? watchdog_timer_fn+0x1e6/0x270
[  +0,000003]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  +0,000002]  ? __hrtimer_run_queues+0x113/0x280
[  +0,000001]  ? ktime_get_update_offsets_now+0x49/0x110
[  +0,000004]  ? hrtimer_interrupt+0xf8/0x230
[  +0,000001]  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  +0,000003]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  +0,000002]  </IRQ>
[  +0,000001]  <TASK>
[  +0,000000]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  +0,000004]  ? smp_call_function_many_cond+0x12e/0x560
[  +0,000001]  ? __pfx_do_sync_core+0x10/0x10
[  +0,000003]  on_each_cpu_cond_mask+0x24/0x40
[  +0,000001]  text_poke_bp_batch+0xbe/0x300
[  +0,000001]  ? arch_jump_label_transform_queue+0x57/0x80
[  +0,000003]  text_poke_finish+0x1f/0x30
[  +0,000002]  arch_jump_label_transform_apply+0x1a/0x30
[  +0,000002]  static_key_enable_cpuslocked+0x63/0x90
[  +0,000003]  static_key_enable+0x1a/0x20
[  +0,000001]  process_one_work+0x173/0x340
[  +0,000004]  worker_thread+0x27b/0x3a0
[  +0,000002]  ? __pfx_worker_thread+0x10/0x10
[  +0,000001]  kthread+0xe5/0x120
[  +0,000002]  ? __pfx_kthread+0x10/0x10
[  +0,000001]  ret_from_fork+0x31/0x50
[  +0,000002]  ? __pfx_kthread+0x10/0x10

These kinds of SCSI issues were reported and fixed back in 2015; if you google the SCSI error messages, you will see what I mean.

As it looked like a SCSI subsystem problem, I rebooted with an old F39 kernel, which had been running stably on the same VM for days. The VM has been running on Proxmox since January this year without such issues. No issues at all.
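
For reference, booting back into an older installed kernel on Fedora can be done roughly like this (hedged sketch; the kernel path is only an example taken from the trace above):

    # list the installed kernels and pick the fc39 entry
    grubby --info=ALL | grep -E '^(index|kernel)'
    # set it as the default for the next boot, then reboot
    grubby --set-default /boot/vmlinuz-6.8.11-200.fc39.x86_64
    reboot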

The old kernel showed the same issues as the new F40 kernel, so whatever changed with F40, the problem is triggered by a change outside of the kernel. That does not mean it is not a kernel issue, but the trigger must be somewhere else. That is also the reason I am writing to the list instead of only opening a kernel bug.

The Proxmox host has not been changed in the last months, so we can rule it out as the cause.

An easy way to reproduce it was to tar the disk contents and siphon them out, e.g. via SSH, so that there is only read I/O (sketched below), or to run mysql_upgrade on a 7 GB database, which stopped working and caused trouble in step 3, while step 4+ was fine. It looks like this amount of data overwhelms the driver's processing capabilities; it disturbs writing to the disks as well.
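
A minimal sketch of the tar reproduction case (user and host are placeholders; the exact command line differed):

    # generate sustained sequential read I/O inside the VM by streaming
    # the filesystem contents out over SSH and discarding them remotely
    tar -cf - / | ssh user@otherhost 'cat > /dev/null'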

While the bus was resetting, the load average climbed to over 500 (6 cores), while real CPU usage was ~1% and I/O wait sat permanently at 99%-100% on all cores.
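
For anyone trying to reproduce this, the load/iowait picture can be observed with the standard tools, e.g.:

    uptime        # load averages
    vmstat 1      # "wa" column = iowait percentage
    iostat -x 1   # per-device await/util (sysstat package)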

The storage on the Proxmox side is a 7 GB/s NVMe, so disk I/O is not the bottleneck. The Proxmox host did not record any issues with the drives while this was happening inside the VMs; it happened solely inside the Fedora 40 VMs.


Attempted workaround:

The VMs' disks were cloned and attached via a SATA-based driver, and the boot order was changed (sketched below).
The SCSI drives are still connected, but no longer used.
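
A hedged sketch of that change on the Proxmox side (VM ID, storage and disk names are placeholders):

    # attach the cloned image as a SATA disk and boot from it;
    # the original scsi0/scsi1 entries stay in the config but are unused
    qm set 100 --sata0 local-nvme:vm-100-disk-2
    qm set 100 --boot order=sata0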

Since this change, the systems have not had a single issue again, nor logged anything remotely similar. In the same timeframe, the SCSI-based setup would have had at least 3 major breakdowns.

It is now 4 in the morning here, so please do not expect a quick reaction if you have questions or expect a kernel Bugzilla entry.


best regards,
Marius Schwarz
