Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed

On 3/1/21 6:11 PM, Jens Axboe wrote:
> On 3/1/21 5:57 PM, Alex Xu (Hello71) wrote:
>> Hi,
>>
>> On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for 
>> about 40 seconds and then continues operation. The following messages 
>> are printed to the kernel log:
>>
>> [  240.650300] PM: suspend entry (deep)
>> [  240.650748] Filesystems sync: 0.000 seconds
>> [  240.725605] Freezing user space processes ...
>> [  260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing to freeze, wq_busy=0):
>> [  260.739497] task:iou-mgr-446     state:S stack:    0 pid:  516 ppid:   439 flags:0x00004224
>> [  260.739504] Call Trace:
>> [  260.739507]  ? sysvec_apic_timer_interrupt+0xb/0x81
>> [  260.739515]  ? pick_next_task_fair+0x197/0x1cde
>> [  260.739519]  ? sysvec_reschedule_ipi+0x2f/0x6a
>> [  260.739522]  ? asm_sysvec_reschedule_ipi+0x12/0x20
>> [  260.739525]  ? __schedule+0x57/0x6d6
>> [  260.739529]  ? del_timer_sync+0xb9/0x115
>> [  260.739533]  ? schedule+0x63/0xd5
>> [  260.739536]  ? schedule_timeout+0x219/0x356
>> [  260.739540]  ? __next_timer_interrupt+0xf1/0xf1
>> [  260.739544]  ? io_wq_manager+0x73/0xb1
>> [  260.739549]  ? io_wq_create+0x262/0x262
>> [  260.739553]  ? ret_from_fork+0x22/0x30
>> [  260.739557] task:iou-mgr-517     state:S stack:    0 pid:  522 ppid:   439 flags:0x00004224
>> [  260.739561] Call Trace:
>> [  260.739563]  ? sysvec_apic_timer_interrupt+0xb/0x81
>> [  260.739566]  ? pick_next_task_fair+0x16f/0x1cde
>> [  260.739569]  ? sysvec_apic_timer_interrupt+0xb/0x81
>> [  260.739571]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
>> [  260.739574]  ? __schedule+0x5b7/0x6d6
>> [  260.739578]  ? del_timer_sync+0x70/0x115
>> [  260.739581]  ? schedule_timeout+0x211/0x356
>> [  260.739585]  ? __next_timer_interrupt+0xf1/0xf1
>> [  260.739588]  ? io_wq_check_workers+0x15/0x11f
>> [  260.739592]  ? io_wq_manager+0x69/0xb1
>> [  260.739596]  ? io_wq_create+0x262/0x262
>> [  260.739600]  ? ret_from_fork+0x22/0x30
>> [  260.739603] task:iou-wrk-517     state:S stack:    0 pid:  523 ppid:   439 flags:0x00004224
>> [  260.739607] Call Trace:
>> [  260.739609]  ? __schedule+0x5b7/0x6d6
>> [  260.739614]  ? schedule+0x63/0xd5
>> [  260.739617]  ? schedule_timeout+0x219/0x356
>> [  260.739621]  ? __next_timer_interrupt+0xf1/0xf1
>> [  260.739624]  ? task_thread.isra.0+0x148/0x3af
>> [  260.739628]  ? task_thread_unbound+0xa/0xa
>> [  260.739632]  ? task_thread_bound+0x7/0x7
>> [  260.739636]  ? ret_from_fork+0x22/0x30
>> [  260.739647] OOM killer enabled.
>> [  260.739648] Restarting tasks ... done.
>> [  260.740077] PM: suspend exit
>>
>> and then a set of similar messages except with s2idle instead of deep.
>>
>> Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of 
>> git://git.kernel.dk/linux-block") appears to resolve the issue. I have 
>> not yet bisected further. Let me know which troubleshooting steps I 
>> should perform next.
> 
> Can you try and pull in:
> 
> git://git.kernel.dk/linux-block io_uring-5.12
> 
> and see if that resolves it? I usually run -git on my laptop as
> well, but something broke it in the merge window, so I need to figure
> out what that is first...
> 
> What distro are you running?

You probably want this on top...


diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..a763e1b09073 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -567,7 +567,7 @@ static int task_thread(void *data, int index)
 	worker->task = current;
 
 	set_cpus_allowed_ptr(current, cpumask_of_node(wqe->node));
-	current->flags |= PF_NO_SETAFFINITY;
+	current->flags |= PF_NO_SETAFFINITY | PF_NOFREEZE;
 
 	raw_spin_lock_irq(&wqe->lock);
 	hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
@@ -722,7 +722,7 @@ static int io_wq_manager(void *data)
 
 	sprintf(buf, "iou-mgr-%d", wq->task_pid);
 	set_task_comm(current, buf);
-	current->flags |= PF_IO_WORKER;
+	current->flags |= PF_IO_WORKER | PF_NOFREEZE;
 	wq->manager = get_task_struct(current);
 
 	complete(&wq->started);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..e7aaf56b4dea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6679,6 +6685,7 @@ static int io_sq_thread(void *data)
 	set_task_comm(current, buf);
 	sqd->thread = current;
 	current->pf_io_worker = NULL;
+	current->flags |= PF_NOFREEZE;
 
 	if (sqd->sq_cpu != -1)
 		set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
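
For readers unfamiliar with the flag, here is a rough userspace model of
what setting PF_NOFREEZE is meant to achieve. It is only a sketch, not
the kernel's freezer code: struct model_task and model_should_freeze()
are hypothetical names made up for illustration, and the flag values are
quoted from include/linux/sched.h as I recall them for this era, so
check them against your tree.

```c
#include <stdbool.h>

/* Assumed values from include/linux/sched.h (around v5.12); verify locally. */
#define PF_IO_WORKER 0x00000010u /* task is an io_uring worker */
#define PF_NOFREEZE  0x00008000u /* task should not be frozen */

/* Hypothetical stand-in for the one task_struct field that matters here. */
struct model_task {
	unsigned int flags;
};

/*
 * Simplified model of the freezer's per-task decision: while a system-wide
 * freeze is in progress, a task is expected to freeze unless it has opted
 * out with PF_NOFREEZE. The real check considers more state than this.
 */
static bool model_should_freeze(const struct model_task *t, bool system_freezing)
{
	if (!system_freezing)
		return false;
	if (t->flags & PF_NOFREEZE)
		return false;
	return true;
}
```

In this model, a task carrying only PF_IO_WORKER is still expected to
freeze, which matches the "refusing to freeze" report above; once
PF_NOFREEZE is ORed in, as the patch does for the manager, worker, and
SQ threads, the freezer no longer waits for it.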

-- 
Jens Axboe



