On 6/3/22 07:51, Tyler Erickson wrote: > This patch series fixes reading the concurrent positioning ranges. > > The first patch fixes reading this in libata, where it was reading > more data than a drive necessarily supports, resulting in a > command abort. > > The second patch fixes the SCSI translated data to put the VPD page > length in the correct starting byte. > > The third patch in sd, the fix is adding 4 instead of 3 for the header > length. Using 3 will always result in an error and was likely used > incorrectly as T10 specifications list all tables starting at byte 0, > and byte 3 is the page length, which would mean there are 4 total > bytes before the remaining data that contains the ranges and other > information. > > Tyler Erickson (3): > libata: fix reading concurrent positioning ranges log > libata: fix translation of concurrent positioning ranges > scsi: sd: Fix interpretation of VPD B9h length > > drivers/ata/libata-core.c | 21 +++++++++++++-------- > drivers/ata/libata-scsi.c | 2 +- > drivers/scsi/sd.c | 2 +- > 3 files changed, 15 insertions(+), 10 deletions(-) > > > base-commit: 700170bf6b4d773e328fa54ebb70ba444007c702 Looks all good to me. I tested this and really wonder how I did not catch these mistakes earlier :) Using a tcmu emulator for various concurrent positioning range configs to test, I got a lockdep splat when unplugging the drive: [ 760.702859] ====================================================== [ 760.702863] WARNING: possible circular locking dependency detected [ 760.702868] 5.18.0+ #1509 Not tainted [ 760.702875] ------------------------------------------------------ [...] [ 760.702966] the existing dependency chain (in reverse order) is: [ 760.702969] [ 760.702969] -> #1 (&q->sysfs_lock){+.+.}-{3:3}: [ 760.702982] __mutex_lock+0x15b/0x1480 [ 760.702998] blk_ia_range_sysfs_show+0x41/0xc0 [ 760.703010] sysfs_kf_seq_show+0x1f2/0x360 [ 760.703022] seq_read_iter+0x40f/0x1130 [ 760.703036] new_sync_read+0x2d8/0x520 [ 760.703049] vfs_read+0x31a/0x450 [ 760.703060] ksys_read+0xf7/0x1d0 [ 760.703070] do_syscall_64+0x34/0x80 [ 760.703081] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 760.703093] [ 760.703093] -> #0 (kn->active#385){++++}-{0:0}: [ 760.703108] __lock_acquire+0x2ba6/0x6a20 [ 760.703125] lock_acquire+0x19f/0x510 [ 760.703136] __kernfs_remove+0x739/0x940 [ 760.703145] kernfs_remove_by_name_ns+0x90/0xe0 [ 760.703154] remove_files+0x8c/0x1a0 [ 760.703165] sysfs_remove_group+0x77/0x150 [ 760.703175] sysfs_remove_groups+0x4f/0x90 [ 760.703186] __kobject_del+0x7d/0x1b0 [ 760.703196] kobject_del+0x31/0x50 [ 760.703203] disk_unregister_independent_access_ranges+0x153/0x290 [ 760.703214] blk_unregister_queue+0x166/0x210 [ 760.703226] del_gendisk+0x2f8/0x7c0 [ 760.703233] sd_remove+0x5e/0xb0 [sd_mod] [ 760.703252] device_release_driver_internal+0x3ad/0x750 [ 760.703262] bus_remove_device+0x2a6/0x570 [ 760.703269] device_del+0x48f/0xb50 [ 760.703280] __scsi_remove_device+0x21b/0x2b0 [scsi_mod] [ 760.703339] scsi_remove_device+0x3a/0x50 [scsi_mod] [ 760.703391] tcm_loop_port_unlink+0xca/0x160 [tcm_loop] [ 760.703407] target_fabric_port_unlink+0xd5/0x120 [target_core_mod] [ 760.703494] configfs_unlink+0x37f/0x7a0 [ 760.703502] vfs_unlink+0x295/0x800 [ 760.703514] do_unlinkat+0x2d9/0x560 [ 760.703520] __x64_sys_unlink+0xa5/0xf0 [ 760.703528] do_syscall_64+0x34/0x80 [ 760.703537] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 760.703548] [ 760.703548] other info that might help us debug this: [ 760.703548] [ 760.703551] Possible unsafe locking scenario: [ 760.703551] [ 760.703554] CPU0 CPU1 [ 760.703556] ---- ---- [ 760.703558] lock(&q->sysfs_lock); [ 760.703565] lock(kn->active#385); [ 760.703573] lock(&q->sysfs_lock); [ 760.703579] lock(kn->active#385); [ 760.703587] [ 760.703587] *** DEADLOCK *** This needs to be checked too, but that is not related to your fixes. I will queue the libata patches for rc1 update. Martin, Do you want to take patch 3 or should I just take it ? -- Damien Le Moal Western Digital Research