Re: 6.6 kernel block: blk_mq_freeze_queue_wait in suspend path but userspace task held the queue->q_usage_counter in 'TASK_FROZEN' state

Kassey Li <quic_yingangl@xxxxxxxxxxx> · Fri, 21 Jun 2024 09:39:47 +0800

On 2024/6/21 0:52, Bart Van Assche wrote:
On 6/19/24 11:53 PM, Kassey Li wrote:
hello, linux block team:

Please repost this message on the linux-scsi mailing list. I think this
is a UFSHCD driver issue rather than a block layer issue.

userspace task A  ['TASK_FROZEN']

     [<ffffffdc0c527f10>] __switch_to+0x1e8
     [<ffffffdc0c5287ec>] __schedule+0x6cc
     [<ffffffdc0c528c28>] schedule+0x78
     [<ffffffdc0c532bbc>] schedule_timeout+0x50
     [<ffffffdc0c529e48>] do_wait_for_common+0x10c
     [<ffffffdc0c529238>] wait_for_completion+0x48
     [<ffffffdc0b4def4c>] __flush_work+0xcc
     [<ffffffdc0b4dee70>] flush_work+0x14
     [<ffffffdc0c091aec>] ufshcd_hold+0xc0

Is ufshcd_hold() executing flush_work(&hba->clk_gating.ungate_work)?
If so, why does the ungate work not complete?

Userspace task A (pid 9453) went to sleep at 1919.065798809 in
__flush_work, which inserts a wq_barrier_func work item.

	Line 4289153: kworker/u19:2-20272 [001] ..... 1919.091512: workqueue_execute_start work_struct 0xffffff881cbba090 function 0xffffffdc0c0a647c ('ufshcd_ungate_work', 0)
	Line 4289842: kworker/u19:2-20272 [001] ..... 1919.096289: workqueue_execute_end work_struct 0xffffff881cbba090 function 0xffffffdc0c0a647c ('ufshcd_ungate_work', 0)
	Line 4289843: kworker/u19:2-20272 [001] ..... 1919.096291: workqueue_execute_start work_struct 0xffffffc0ceddb4f0 function 0xffffffdc0b4e322c ('wq_barrier_func', 0)
	Line 4289844: kworker/u19:2-20272 [001] ..... 1919.096293: workqueue_execute_end work_struct 0xffffffc0ceddb4f0 function 0xffffffdc0b4e322c ('wq_barrier_func', 0)

The ftrace log shows that both ufshcd_ungate_work and wq_barrier_func
ran to completion.

A sched_waking event should have been emitted for pid=9453 between
1919.096291 and 1919.096293, since the call chain is
wq_barrier_func->complete->swake_up_locked->try_to_wake_up->ttwu_state_match.
However, on this path ttwu_state_match only wakes up a task that is in
a TASK_NORMAL state.

pid=9453 is in TASK_FROZEN, so it does not match and is never woken.
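
To make the mask check concrete, here is a minimal userspace sketch (not
kernel code). The state values are written from memory of
include/linux/sched.h in v6.6 and should be treated as assumptions to
verify against your tree. wake_mask_matches() is a hypothetical helper
that models only the bit test, not the rest of ttwu_state_match (for
example, any saved_state handling is left out).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Task state bits as recalled from include/linux/sched.h (v6.6).
 * Assumption: check the exact values against your kernel tree.
 */
#define TASK_INTERRUPTIBLE   0x00000001u
#define TASK_UNINTERRUPTIBLE 0x00000002u
#define TASK_FROZEN          0x00008000u
#define TASK_NORMAL          (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* TASK_FROZEN shares no bit with TASK_NORMAL under these values. */
static_assert((TASK_FROZEN & TASK_NORMAL) == 0,
	      "TASK_FROZEN must not overlap TASK_NORMAL");

/*
 * Hypothetical helper: simplified model of the state test. A wakeup
 * proceeds only if the task's current state intersects the wake mask.
 */
static bool wake_mask_matches(unsigned int task_state, unsigned int wake_mask)
{
	return (task_state & wake_mask) != 0;
}
```

With these values, wake_mask_matches(TASK_UNINTERRUPTIBLE, TASK_NORMAL)
is true while wake_mask_matches(TASK_FROZEN, TASK_NORMAL) is false, which
is consistent with no sched_waking showing up for pid=9453.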

Do you have any suggestions on how to resolve this?

Thanks,

Bart.