On Wed 21-06-23 12:40:34, Sebastian Andrzej Siewior wrote:
> __build_all_zonelists() acquires zonelist_update_seq by first disabling
> interrupts via local_irq_save() and then acquiring the seqlock with
> write_seqlock(). This is troublesome and leads to problems on
> PREEMPT_RT because the inner spinlock_t is now acquired with disabled
> interrupts.

And the spinlock might sleep with PREEMPT_RT so a deadlock, right? It
would be better to call that out explicitly.

> The API provides write_seqlock_irqsave() which does the right thing in
> one step.
> printk_deferred_enter() has to be invoked in non-migrate-able context to
> ensure that deferred printing is enabled and disabled on the same CPU.
> This is the case after zonelist_update_seq has been acquired.
>
> Acquire zonelist_update_seq with write_seqlock_irqsave() and then defer
> printk output.
>
> Fixes: 1007843a91909 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks

> ---
>  mm/page_alloc.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 47421bedc12b7..99b7e7d09c5c0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5808,11 +5808,10 @@ static void __build_all_zonelists(void *data)
>  	unsigned long flags;
>
>  	/*
> -	 * Explicitly disable this CPU's interrupts before taking seqlock
> -	 * to prevent any IRQ handler from calling into the page allocator
> -	 * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock.
> +	 * The zonelist_update_seq must be acquired with irqsave because the
> +	 * reader can be invoked from IRQ with GFP_ATOMIC.
>  	 */
> -	local_irq_save(flags);
> +	write_seqlock_irqsave(&zonelist_update_seq, flags);
>  	/*
>  	 * Explicitly disable this CPU's synchronous printk() before taking
>  	 * seqlock to prevent any printk() from trying to hold port->lock, for
> @@ -5820,7 +5819,6 @@ static void __build_all_zonelists(void *data)
>  	 * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held.
>  	 */
>  	printk_deferred_enter();
> -	write_seqlock(&zonelist_update_seq);
>
>  #ifdef CONFIG_NUMA
>  	memset(node_load, 0, sizeof(node_load));
> @@ -5857,9 +5855,8 @@ static void __build_all_zonelists(void *data)
>  #endif
>  	}
>
> -	write_sequnlock(&zonelist_update_seq);
>  	printk_deferred_exit();
> -	local_irq_restore(flags);
> +	write_sequnlock_irqrestore(&zonelist_update_seq, flags);
>  }
>
>  static noinline void __init
> --
> 2.40.1

-- 
Michal Hocko
SUSE Labs