Re: [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2023/06/22 16:18, Michal Hocko wrote:
>>> It is explained as the first deadlock scenario in commit 1007843a9190
>>> ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock").
>>> We have to disable IRQ before making zonelist_update_seq.seqcount odd.
>>>
>>
>> Since we must replace local_irq_save() + write_seqlock() with write_seqlock_irqsave() for
>> CONFIG_PREEMPT_RT=y case but we must not replace local_irq_save() + write_seqlock() with
>> write_seqlock_irqsave() for CONFIG_PREEMPT_RT=n case, the proper fix is something like below?
> 
> Now, I am confused. Why write_seqlock_irqsave is not allowed for !RT?
> Let me quote the changelog and he scenario 1:
>         write_seqlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount odd
>         // e.g. timer interrupt handler runs at this moment
>           some_timer_func() {
>             kmalloc(GFP_ATOMIC) {
>               __alloc_pages_slowpath() {
>                 read_seqbegin(&zonelist_update_seq) {
>                   // spins forever because zonelist_update_seq.seqcount is odd
>                 }
>               }
>             }
>           }
>         // e.g. timer interrupt handler finishes
>         write_sequnlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount even
> 
> This is clearly impossible with write_seqlock_irqsave as interrupts are
> disabled before the lock is taken.

Well, it seems that "I don't want to replace" rather than "we must not replace".
I reread the thread but I couldn't find why nobody suggested write_seqlock_irqsave().
The reason I proposed the

  local_irq_save() => printk_deferred_enter() => write_seqlock()

ordering implies a precaution in case write_seqlock() involves printk() (e.g. lockdep,
KCSAN, soft-lockup warning), in addition to "local_irq_save() before printk_deferred_enter()"
requirement. Maybe people in that thread were happy with preserving this precaution...

You commented

  There shouldn't be any other locks (apart from hotplug) taken in that path IIRC.

at https://lkml.kernel.org/ZCrYQj+2/uMtqNBm@xxxxxxxxxxxxxx .

If __build_all_zonelists() is already serialized by hotplug lock, we don't
need to call spin_lock(&zonelist_update_seq.lock) and we will be able to
replace write_seqlock(&zonelist_update_seq) with
write_seqcount_begin(&zonelist_update_seq.seqcount) like
cpuset_change_task_nodemask() does?





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux