On Wed, 2017-02-01 at 09:13 -0800, Jens Axboe wrote:
> On 02/01/2017 08:46 AM, Bart Van Assche wrote:
> > Thanks for having looked into this. However, after having pulled the latest
> > block for-next tree (dbb85b06229f) another lockup was triggered soon (02-sq
> > is the name of a shell script of the srp-test suite):
> >
> > [ 243.021265] sysrq: SysRq : Show Blocked State
> > [ 243.021301] task PC stack pid father
> > [ 243.022909] 02-sq D 0 10864 10509 0x00000000
> > [ 243.022933] Call Trace:
> > [ 243.022956] __schedule+0x2da/0xb00
> > [ 243.022979] schedule+0x38/0x90
> > [ 243.023002] blk_mq_freeze_queue_wait+0x51/0xa0
> > [ 243.023025] ? remove_wait_queue+0x70/0x70
> > [ 243.023047] blk_mq_freeze_queue+0x15/0x20
> > [ 243.023070] elevator_switch+0x24/0x220
> > [ 243.023093] __elevator_change+0xd3/0x110
> > [ 243.023115] elv_iosched_store+0x21/0x60
> > [ 243.023140] queue_attr_store+0x54/0x90
> > [ 243.023164] sysfs_kf_write+0x40/0x50
> > [ 243.023188] kernfs_fop_write+0x137/0x1c0
> > [ 243.023214] __vfs_write+0x23/0x140
> > [ 243.023242] ? rcu_read_lock_sched_held+0x45/0x80
> > [ 243.023265] ? rcu_sync_lockdep_assert+0x2a/0x50
> > [ 243.023287] ? __sb_start_write+0xde/0x200
> > [ 243.023308] ? vfs_write+0x190/0x1e0
> > [ 243.023329] vfs_write+0xc3/0x1e0
> > [ 243.023351] SyS_write+0x44/0xa0
> > [ 243.023373] entry_SYSCALL_64_fastpath+0x18/0xad
>
> So that's changing the elevator - did this happen while heavy IO was
> going to the drive, or was it idle?

Hello Jens,

The shell command that was used to set the elevator is the following
($realdev is a dm device):

echo none > /sys/class/block/$(basename "$realdev")/queue/scheduler

I'm not sure whether any I/O was ongoing when the scheduler was being changed
from "none" into "none".
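
To answer the "heavy IO or idle" question on a future run, something like the following could be called right before the scheduler write. This is only a sketch: queue_state is a hypothetical helper name, not part of srp-test, and it assumes that the sysfs scheduler file marks the active scheduler with square brackets and that the inflight file holds two counters (reads, writes). The counters are a snapshot and can change before the write happens.

```shell
#!/bin/sh
# Hypothetical helper (not part of srp-test): print the active I/O scheduler
# and the in-flight request counters of a block device.
#   $1 = device name, e.g. dm-0
#   $2 = sysfs root, defaults to /sys/class/block (overridable for testing)
queue_state() {
    dev=$1
    root=${2:-/sys/class/block}
    # Assumption: the active scheduler is the bracketed entry,
    # e.g. "[none] mq-deadline". Without brackets this stays empty.
    active=$(sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$root/$dev/queue/scheduler")
    # Assumption: inflight contains two numbers, reads and writes in flight.
    read -r rd wr < "$root/$dev/inflight"
    echo "scheduler=$active inflight_reads=$rd inflight_writes=$wr"
}
```

Usage would be queue_state "$(basename "$realdev")" immediately before the echo above, with the output logged next to the test results.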

There are two other processes that got stuck but running lsof against these
processes did not reveal what block device these two processes were trying to
examine:

[ 243.021672] systemd-udevd D 0 10585 504 0x00000000
[ 243.021700] Call Trace:
[ 243.021726] __schedule+0x2da/0xb00
[ 243.021749] schedule+0x38/0x90
[ 243.021771] schedule_timeout+0x2fe/0x640
[ 243.021882] io_schedule_timeout+0x9f/0x110
[ 243.021930] wait_on_page_bit_common+0x121/0x1e0
[ 243.021977] generic_file_read_iter+0x17c/0x790
[ 243.022030] blkdev_read_iter+0x30/0x40
[ 243.022053] __vfs_read+0xbb/0x130
[ 243.022075] vfs_read+0xa3/0x170
[ 243.022098] SyS_read+0x44/0xa0
[ 243.022120] entry_SYSCALL_64_fastpath+0x18/0xad
[ 243.022298] systemd-udevd D 0 10612 504 0x00000000
[ 243.022320] Call Trace:
[ 243.022341] __schedule+0x2da/0xb00
[ 243.022363] schedule+0x38/0x90
[ 243.022383] schedule_timeout+0x2fe/0x640
[ 243.022490] io_schedule_timeout+0x9f/0x110
[ 243.022543] wait_on_page_bit_common+0x121/0x1e0
[ 243.022595] generic_file_read_iter+0x17c/0x790
[ 243.022640] blkdev_read_iter+0x30/0x40
[ 243.022663] __vfs_read+0xbb/0x130
[ 243.022685] vfs_read+0xa3/0x170
[ 243.022707] SyS_read+0x44/0xa0
[ 243.022729] entry_SYSCALL_64_fastpath+0x18/0xad

# lsof -p10585
COMMAND     PID USER  FD  TYPE    DEVICE SIZE/OFF NODE NAME
systemd-u 10585 root  cwd unknown                      /proc/10585/cwd (readlink: No such file or directory)
systemd-u 10585 root  rtd unknown                      /proc/10585/root (readlink: No such file or directory)
systemd-u 10585 root  txt unknown                      /proc/10585/exe
# lsof -p10612
COMMAND     PID USER  FD  TYPE    DEVICE SIZE/OFF NODE NAME
systemd-u 10612 root  cwd unknown                      /proc/10612/cwd (readlink: No such file or directory)
systemd-u 10612 root  rtd unknown                      /proc/10612/root (readlink: No such file or directory)
systemd-u 10612 root  txt unknown                      /proc/10612/exe

Bart.