Re: [PATCH 1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On (24/10/19 23:03), Ming Lei wrote:
> On Sat, Oct 19, 2024 at 10:50:10PM +0900, Sergey Senozhatsky wrote:
> > On (24/10/19 21:09), Ming Lei wrote:
> > > On Sat, Oct 19, 2024 at 09:58:04PM +0900, Sergey Senozhatsky wrote:
> > > > On (24/10/19 20:50), Ming Lei wrote:
> > > > > On Sat, Oct 19, 2024 at 09:37:27PM +0900, Sergey Senozhatsky wrote:
> > [..]
> 
> Probably bio_queue_enter() waits for runtime PM, and the queue is in
> ->pm_only state, and BLK_MQ_REQ_PM isn't passed actually from
> ioctl_internal_command() <- scsi_set_medium_removal().

Sorry for the delay.

Another report.
I see lots of buffer I/O errors

<6>[ 364.268167] usb-storage 3-3:1.0: USB Mass Storage device detected
<6>[ 364.268551] scsi host3: usb-storage 3-3:1.0
<3>[ 364.274806] Buffer I/O error on dev sdc1, logical block 0, lost async page write
<5>[ 365.318424] scsi 3:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4
<5>[ 365.319898] sd 3:0:0:0: [sdc] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB)
<5>[ 365.320077] sd 3:0:0:0: [sdc] Write Protect is off
<7>[ 365.320085] sd 3:0:0:0: [sdc] Mode Sense: 03 00 00 00
<4>[ 365.320255] sd 3:0:0:0: [sdc] No Caching mode page found
<4>[ 365.320262] sd 3:0:0:0: [sdc] Assuming drive cache: write through
<6>[ 365.322483] sdc: sdc1
<5>[ 365.323130] sd 3:0:0:0: [sdc] Attached SCSI removable disk
<6>[ 369.083225] usb 3-3: USB disconnect, device number 49

Then PM suspend/resume.

After resume

<7>[ 1338.847937] PM: resume of devices complete after 291.422 msecs
<6>[ 1338.854215] OOM killer enabled.
<6>[ 1338.854235] Restarting tasks ...
<6>[ 1338.854797] mei_hdcp 0000:00:16.0-(UUID: 7): bound 0000:00:02.0 (ops 0xffffffffb8f03e50)
<6>[ 1338.857745] mei_pxp 0000:00:16.0-(UUID: 2): bound 0000:00:02.0 (ops 0xffffffffb8f16a80)
<4>[ 1338.859663] done.
<5>[ 1338.859683] random: crng reseeded on system resumption
<12>[ 1338.868200] init: cupsd main process ended, respawning
<6>[ 1338.868541] Resume caused by IRQ 9, acpi
<6>[ 1338.868549] Resume caused by IRQ 98, chromeos-ec
<6>[ 1338.868555] PM: suspend exit

lots of buffer I/O errors again and eventually a deadlock.  The deadlock
happens much later than 120 seconds after resume, so I cannot directly
connect those events.

[..]
<6>[ 1859.660882] usb-storage 3-3:1.0: USB Mass Storage device detected
<6>[ 1859.661457] scsi host4: usb-storage 3-3:1.0
<3>[ 1859.668180] Buffer I/O error on dev sdd1, logical block 0, lost async page write
<5>[ 1860.697826] scsi 4:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4
<5>[ 1860.699222] sd 4:0:0:0: [sdd] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB)
<5>[ 1860.699373] sd 4:0:0:0: [sdd] Write Protect is off
<7>[ 1860.699380] sd 4:0:0:0: [sdd] Mode Sense: 03 00 00 00
<4>[ 1860.699522] sd 4:0:0:0: [sdd] No Caching mode page found
<4>[ 1860.699526] sd 4:0:0:0: [sdd] Assuming drive cache: write through
<6>[ 1860.701393] sdd: sdd1
<5>[ 1860.701886] sd 4:0:0:0: [sdd] Attached SCSI removable disk
<6>[ 1862.077109] usb 3-3: USB disconnect, device number 110
<6>[ 1862.338159] usb 3-3: new high-speed USB device number 111 using xhci_hcd
<6>[ 1862.468090] usb 3-3: New USB device found, idVendor=346d, idProduct=5678, bcdDevice= 2.00
<6>[ 1862.468105] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=(Serial: 8)
<6>[ 1862.468111] usb 3-3: Product: Disk 2.0
<6>[ 1862.468115] usb 3-3: Manufacturer: USB
<6>[ 1862.468119] usb 3-3: SerialNumber: (Serial: 9)
<6>[ 1862.469962] usb-storage 3-3:1.0: USB Mass Storage device detected
<6>[ 1862.470642] scsi host3: usb-storage 3-3:1.0
<3>[ 1862.476447] Buffer I/O error on dev sdd1, logical block 0, lost async page write
<5>[ 1863.514018] scsi 3:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4
<5>[ 1863.515489] sd 3:0:0:0: [sdd] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB)
<5>[ 1863.515640] sd 3:0:0:0: [sdd] Write Protect is off
<7>[ 1863.515646] sd 3:0:0:0: [sdd] Mode Sense: 03 00 00 00
<4>[ 1863.515797] sd 3:0:0:0: [sdd] No Caching mode page found
<4>[ 1863.515802] sd 3:0:0:0: [sdd] Assuming drive cache: write through
<6>[ 1863.518227] sdd: sdd1
<5>[ 1863.518551] sd 3:0:0:0: [sdd] Attached SCSI removable disk
<6>[ 1865.018356] usb 3-3: USB disconnect, device number 111
<6>[ 1865.285091] usb 3-3: new high-speed USB device number 112 using xhci_hcd
<3>[ 1865.605088] usb 3-3: device descriptor read/64, error -71
<6>[ 1865.844873] usb 3-3: New USB device found, idVendor=346d, idProduct=5678, bcdDevice= 2.00
<6>[ 1865.844892] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=(Serial: 8)
<6>[ 1865.844898] usb 3-3: Product: Disk 2.0
<6>[ 1865.844903] usb 3-3: Manufacturer: USB
<6>[ 1865.844906] usb 3-3: SerialNumber: (Serial: 9)
<6>[ 1865.847205] usb-storage 3-3:1.0: USB Mass Storage device detected
<6>[ 1865.847806] scsi host4: usb-storage 3-3:1.0
<3>[ 1865.853941] Buffer I/O error on dev sdd1, logical block 0, lost async page write
<6>[ 1866.436729] usb 3-3: USB disconnect, device number 112
<6>[ 1866.700998] usb 3-3: new high-speed USB device number 113 using xhci_hcd
<6>[ 1866.829449] usb 3-3: New USB device found, idVendor=346d, idProduct=5678, bcdDevice= 2.00
<6>[ 1866.829466] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=(Serial: 8)
<6>[ 1866.829473] usb 3-3: Product: Disk 2.0
<6>[ 1866.829478] usb 3-3: Manufacturer: USB
<6>[ 1866.829482] usb 3-3: SerialNumber: (Serial: 9)
<6>[ 1866.831605] usb-storage 3-3:1.0: USB Mass Storage device detected
<6>[ 1866.832173] scsi host3: usb-storage 3-3:1.0
<5>[ 1867.866118] scsi 3:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4
<5>[ 1867.868213] sd 3:0:0:0: [sdd] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB)
<5>[ 1867.868604] sd 3:0:0:0: [sdd] Write Protect is off
<7>[ 1867.868616] sd 3:0:0:0: [sdd] Mode Sense: 03 00 00 00
<4>[ 1867.869071] sd 3:0:0:0: [sdd] No Caching mode page found
<4>[ 1867.869081] sd 3:0:0:0: [sdd] Assuming drive cache: write through
<6>[ 1867.871429] sdd: sdd1
<5>[ 1867.871857] sd 3:0:0:0: [sdd] Attached SCSI removable disk
<6>[ 1868.423593] usb 3-3: USB disconnect, device number 113
<6>[ 1868.431172] sdd: detected capacity change from 122880000 to 0
<28>[ 1928.670962] udevd[203]: sdd: Worker [9839] processing SEQNUM=6508 is taking a long time
<3>[ 2004.633104] INFO: task kworker/0:3:187 blocked for more than 122 seconds.
<3>[ 2004.633125] Tainted: G U 6.6.41-03520-gd3d77f15f842 #1
<3>[ 2004.633131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[ 2004.633149] task:kworker/0:3 state:D stack:0 pid:187 ppid:2 flags:0x00004000
<6>[ 2004.633149] Workqueue: usb_hub_wq hub_event
<6>[ 2004.633166] Call Trace:
<6>[ 2004.633172] <TASK>
<6>[ 2004.633179] schedule+0x4f4/0x1540
<6>[ 2004.633190] ? default_wake_function+0x388/0xcd0
<6>[ 2004.633200] schedule_preempt_disabled+0x15/0x30
<6>[ 2004.633206] __mutex_lock_slowpath+0x2b5/0x4d0
<6>[ 2004.633212] del_gendisk+0x136/0x370
<6>[ 2004.633222] sd_remove+0x30/0x60
<6>[ 2004.633230] device_release_driver_internal+0x1a2/0x2a0
<6>[ 2004.633239] bus_remove_device+0x154/0x180
<6>[ 2004.633248] device_del+0x207/0x370
<6>[ 2004.633256] ? __pfx_transport_remove_classdev+0x10/0x10
<6>[ 2004.633264] ? attribute_container_device_trigger+0xe3/0x110
<6>[ 2004.633272] __scsi_remove_device+0xc0/0x170
<6>[ 2004.633279] scsi_forget_host+0x45/0x60
<6>[ 2004.633287] scsi_remove_host+0x87/0x170
<6>[ 2004.633295] usb_stor_disconnect+0x63/0xb0
<6>[ 2004.633302] usb_unbind_interface+0xbe/0x250
<6>[ 2004.633309] device_release_driver_internal+0x1a2/0x2a0
<6>[ 2004.633315] bus_remove_device+0x154/0x180
<6>[ 2004.633322] device_del+0x207/0x370
<6>[ 2004.633328] ? kobject_release+0x56/0xb0
<6>[ 2004.633336] usb_disable_device+0x72/0x170
<6>[ 2004.633342] usb_disconnect+0xeb/0x280
<6>[ 2004.633350] hub_event+0xac7/0x1760
<6>[ 2004.633359] worker_thread+0x355/0x900
<6>[ 2004.633367] kthread+0xed/0x110
<6>[ 2004.633374] ? __pfx_worker_thread+0x10/0x10
<6>[ 2004.633381] ? __pfx_kthread+0x10/0x10
<6>[ 2004.633387] ret_from_fork+0x38/0x50
<6>[ 2004.633393] ? __pfx_kthread+0x10/0x10
<6>[ 2004.633399] ret_from_fork_asm+0x1b/0x30
<6>[ 2004.633407] </TASK>
<3>[ 2004.633496] INFO: task cros-disks:1614 blocked for more than 122 seconds.
<3>[ 2004.633502] Tainted: G U 6.6.41-03520-gd3d77f15f842 #1
<3>[ 2004.633506] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[ 2004.633519] task:cros-disks state:D stack:0 pid:1614 ppid:1 flags:0x00004002
<6>[ 2004.633519] Call Trace:
<6>[ 2004.633523] <TASK>
<6>[ 2004.633527] schedule+0x4f4/0x1540
<6>[ 2004.633533] ? xas_store+0xc57/0xcc0
<6>[ 2004.633539] ? lru_add_drain+0x4d8/0x6e0
<6>[ 2004.633548] blk_queue_enter+0x172/0x250
<6>[ 2004.633557] ? __pfx_autoremove_wake_function+0x10/0x10
<6>[ 2004.633565] blk_mq_alloc_request+0x167/0x210
<6>[ 2004.633573] scsi_execute_cmd+0x65/0x240
<6>[ 2004.633580] ioctl_internal_command+0x6c/0x150
<6>[ 2004.633590] scsi_set_medium_removal+0x63/0xc0
<6>[ 2004.633598] sd_release+0x42/0x50
<6>[ 2004.633606] blkdev_put+0x13b/0x1f0
<6>[ 2004.633615] blkdev_release+0x2b/0x40
<6>[ 2004.633623] __fput_sync+0x9b/0x2c0
<6>[ 2004.633632] __se_sys_close+0x69/0xc0
<6>[ 2004.633639] do_syscall_64+0x60/0x90
<6>[ 2004.633649] ? exit_to_user_mode_prepare+0x49/0x130
<6>[ 2004.633657] ? do_syscall_64+0x6f/0x90
<6>[ 2004.633665] ? do_syscall_64+0x6f/0x90
<6>[ 2004.633672] ? do_syscall_64+0x6f/0x90
<6>[ 2004.633680] ? irq_exit_rcu+0x38/0x90
<6>[ 2004.633687] ? exit_to_user_mode_prepare+0x49/0x130
<6>[ 2004.633694] entry_SYSCALL_64_after_hwframe+0x73/0xdd
<6>[ 2004.633703] RIP: 0033:0x786d55239960
<6>[ 2004.633711] RSP: 002b:00007ffd1c6d8c28 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
<6>[ 2004.633719] RAX: ffffffffffffffda RBX: 00005a5ffe743fd0 RCX: 0000786d55239960
<6>[ 2004.633725] RDX: 0000786d55307b00 RSI: 0000000000000000 RDI: 000000000000000c
<6>[ 2004.633730] RBP: 00007ffd1c6d8d30 R08: 0000000000000007 R09: 00005a5ffe78a9f0
<6>[ 2004.633735] R10: 8a1ecef621fff8a0 R11: 0000000000000202 R12: 0000000000000831
<6>[ 2004.633741] R13: 00005a5ffe743f60 R14: 00005a5ffe743f80 R15: 000000000000000c
<6>[ 2004.633746] </TASK>




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux