On Thu, Oct 26, 2023 at 05:56:03PM +0900, Damien Le Moal wrote:
> On 2023/10/26 17:25, Bagas Sanjaya wrote:
> > Hi,
> >
> > I notice a bug report on Bugzilla [1]. Quoting from it:
>
> [...]
>
> >> [ 437.249448] PM: suspend entry (deep)
> >> [ 437.255308] Filesystems sync: 0.005 seconds
> >> [ 437.255570] Freezing user space processes
> >> [ 437.257093] Freezing user space processes completed (elapsed 0.001 seconds)
> >> [ 437.257097] OOM killer disabled.
> >> [ 437.257098] Freezing remaining freezable tasks
> >> [ 437.258226] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> >> [ 437.258281] printk: Suspending console(s) (use no_console_suspend to debug)
> >> [ 437.291778] sd 0:0:0:0: [sdb] Synchronizing SCSI cache
> >> [ 437.291825] sd 0:0:1:0: [sdc] Synchronizing SCSI cache
> >> [ 437.292083] sd 0:0:0:0: [sdb] Stopping disk
> >> [ 437.292083] sd 0:0:1:0: [sdc] Stopping disk
> >> [ 438.363660] sd 1:0:0:0: [sda] Synchronizing SCSI cache
> >> [ 438.363760] sd 1:0:0:0: [sda] Stopping disk
>
> Given this message, this does not look like the latest kernel.
>
> >> [ 589.081341] drivers/scsi/mvsas/mv_sas.c 1304:mvs_I_T_nexus_reset for device[1]:rc= 0
> >> [ 610.481270] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> >> [ 610.481280] rcu: 11-...0: (0 ticks this GP) idle=4f84/1/0x4000000000000000 softirq=19873/19873 fqs=1159
> >> [ 610.481292] (detected by 5, t=5252 jiffies, g=53581, q=31630 ncpus=12)
> >> [ 610.481299] Sending NMI from CPU 5 to CPUs 11:
> >> [ 610.481309] NMI backtrace for cpu 11
> >> [ 610.481312] CPU: 11 PID: 3152 Comm: kworker/u32:59 Tainted: G I 6.1.57-vanilla #14
> >> [ 610.481318] Hardware name: System manufacturer System Product Name/P6T WS PRO, BIOS 1205 09/24/2010
> >> [ 610.481321] Workqueue: events_unbound async_run_entry_fn
> >> [ 610.481329] RIP: 0010:mvs_int_rx+0x81/0x150 [mvsas]
> >> [ 610.481346] Code: 00 00 44 39 75 70 74 47 48 8b 45 60 45 89 e6 41 81 e6 ff 03 00 00 41 8d 56 01 8b 1c 90 49 89 d4 41 89 df 41 81 e7 00 00 08 00 <f7> c3 00 00 01 00 74 58 31 d2 89 de 48 89 ef e8 0b f9 ff ff 45 85
> >> [ 610.481350] RSP: 0018:ffffb61f06acbb60 EFLAGS: 00000046
> >> [ 610.481354] RAX: ffff9a7cc2658000 RBX: 0000000000010000 RCX: 0000000000000000
> >> [ 610.481358] RDX: 000000000000026e RSI: 0000000000010000 RDI: ffff9a7ce2660000
> >> [ 610.481361] RBP: ffff9a7ce2660000 R08: ffff9a7ce2660f00 R09: ffff9a7ce2660000
> >> [ 610.481364] R10: ffff9a7ce26600c8 R11: ffffffff884d4300 R12: 000000000000026e
> >> [ 610.481367] R13: 0000000000000000 R14: 000000000000026d R15: 0000000000000000
> >> [ 610.481371] FS: 0000000000000000(0000) GS:ffff9a7df7cc0000(0000) knlGS:0000000000000000
> >> [ 610.481375] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [ 610.481378] CR2: 0000563633425300 CR3: 0000000077210006 CR4: 00000000000206e0
> >> [ 610.481382] Call Trace:
> >> [ 610.481385] <NMI>
> >> [ 610.481389] ? nmi_cpu_backtrace.cold+0x1b/0x76
> >> [ 610.481398] ? nmi_cpu_backtrace_handler+0xd/0x20
> >> [ 610.481403] ? nmi_handle+0x5d/0x120
> >> [ 610.481410] ? mvs_int_rx+0x81/0x150 [mvsas]
> >> [ 610.481423] ? default_do_nmi+0x69/0x170
> >> [ 610.481428] ? exc_nmi+0x13c/0x170
> >> [ 610.481432] ? end_repeat_nmi+0x16/0x67
> >> [ 610.481443] ? mvs_int_rx+0x81/0x150 [mvsas]
> >> [ 610.481457] ? mvs_int_rx+0x81/0x150 [mvsas]
> >> [ 610.481470] ? mvs_int_rx+0x81/0x150 [mvsas]
> >> [ 610.481483] </NMI>
> >> [ 610.481484] <TASK>
> >> [ 610.481487] mvs_do_release_task+0x3f/0x90 [mvsas]
> >> [ 610.481501] mvs_release_task+0x13e/0x1a0 [mvsas]
> >> [ 610.481516] mvs_I_T_nexus_reset+0xb2/0xd0 [mvsas]
> >> [ 610.481530] ? sas_ata_wait_after_reset+0x80/0x80 [libsas]
> >> [ 610.481552] sas_ata_hard_reset+0x48/0x80 [libsas]
> >> [ 610.481575] ata_eh_reset+0x2e5/0x1090 [libata]
> >> [ 610.481631] ? sas_ata_wait_after_reset+0x80/0x80 [libsas]
> >> [ 610.481652] ? sas_ata_wait_after_reset+0x80/0x80 [libsas]
> >> [ 610.481676] ata_eh_recover+0x2e6/0xe00 [libata]
> >> [ 610.481728] ? __wake_up_klogd.part.0+0x56/0x80
> >> [ 610.481735] ? vprintk_emit+0x207/0x290
> >> [ 610.481739] ? smp_ata_check_ready_type+0xb0/0xb0 [libsas]
> >> [ 610.481760] ? sas_ata_wait_after_reset+0x80/0x80 [libsas]
> >> [ 610.481783] ? smp_ata_check_ready_type+0xb0/0xb0 [libsas]
> >> [ 610.481804] ? sas_ata_wait_after_reset+0x80/0x80 [libsas]
> >> [ 610.481824] ata_do_eh+0x75/0xf0 [libata]
> >> [ 610.481876] ? del_timer_sync+0x6f/0xb0
> >> [ 610.481884] ata_scsi_port_error_handler+0x3a8/0x800 [libata]
> >> [ 610.481938] async_sas_ata_eh+0x44/0x7f [libsas]
> >> [ 610.481960] async_run_entry_fn+0x30/0x130
> >> [ 610.481966] process_one_work+0x1c7/0x380
> >> [ 610.481974] worker_thread+0x4d/0x380
> >> [ 610.481981] ? rescuer_thread+0x3a0/0x3a0
> >> [ 610.481987] kthread+0xe9/0x110
> >> [ 610.481992] ? kthread_complete_and_exit+0x20/0x20
> >> [ 610.481999] ret_from_fork+0x22/0x30
> >> [ 610.482009] </TASK>
> >> [ 665.286198] NMI watchdog: Watchdog detected hard LOCKUP on cpu 11
> Could be due to the libata deadlock without the recent suspend/resume fixes. Or
> this is yet another adapter that was not tested for suspend/resume. mpt3sas
> crashes the machine 100% of the time as well. I had no time to dig into that issue.
>

The reporter on Bugzilla [1] said:

> Hello again,
> 6.6rc7 was unable to resume disks from s3 as expected.
> Basically mvsas does not resume the attached devices at all.
> The suspend/resume logic was never implemented and nothing happens on resume.

It looks like mvsas driver doesn't have S3/S4 logic at all, right?

Thanks.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218030#add_comment

-- 
An old man doll... just what I always wanted! - Clara