Re: [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed 2023-06-21 16:34:21, Sebastian Andrzej Siewior wrote:
> On 2023-06-21 22:32:40 [+0900], Tetsuo Handa wrote:
> > include/linux/seqlock.h says
> …
> > Is above understanding correct?
> 
> That is correct.
> 
> > And you are trying to replace
> > 
> >   local_irq_save(flags);
> >   printk_deferred_enter();
> >   write_seqlock(&zonelist_update_seq);
> > 
> > with
> > 
> >   write_seqlock_irqsave(&zonelist_update_seq, flags);
> >   printk_deferred_enter();
> > 
> > , aren't you?
> 
> correct.
> 
> > But include/linux/printk.h says
> > 
> >   /*
> >    * The printk_deferred_enter/exit macros are available only as a hack for
> >    * some code paths that need to defer all printk console printing. Interrupts
> >    * must be disabled for the deferred duration.
> >    */
> >   #define printk_deferred_enter __printk_safe_enter
> >   #define printk_deferred_exit __printk_safe_exit
> > 
> > which means that local_irq_save() is _required_ before printk_deferred_enter().
> 
> It says that, yes, but that requirement is described as too heavy. The
> requirement is that printk_deferred_enter() happens on the same CPU as
> printk_deferred_exit().

True.

> This can be achieved by an explicit
> local_irq_save(), yes, but also by something like spin_lock_irq() which
> _ensures_ that the task does not migrate to another CPU while the lock
> is acquired. This is the requirement by the current implementation.

Good to know.

> > If local_irq_save() is hidden by your patch, what guarantees that
> > printk_deferred_enter() and printk_deferred_exit() run on the same CPU?
> 
> Because spin_lock_irqsave() on CONFIG_PREEMPT_RT uses migrate_disable().
> The function ensures that the scheduler does not migrate the task to
> another CPU. The CPU is even block from going down (as in CPU-hotplug)
> until the matching migrate_enable() occurs.
> 
> > Also, if local_irq_save() is hidden due to RT, what guarantees that
> > 
> >   write_seqlock_irqsave(&zonelist_update_seq, flags);
> >   <<IRQ>>
> >     some_timer_function() {
> >       printk();
> >     }
> >   <<IRQ>>
> >   printk_deferred_enter();
> >
> > does not happen because write_seqlock_irqsave() does not disable IRQ?
> 
> I don't see how zonelist_update_seq and printk here are connected
> without the port lock/ or memory allocation. But there are two things
> that are different on RT which probably answer your question:
> 
> - If the reader observes an odd sequence number then it acquires the
>   lock of the sequence counter (it is held by the writer) which
>   forces the writer to complete the write critical section and then the
>   reader can continue. There are _no_ memory allocation within a
>   hard IRQ context (as in the actual interrupt). The timer (hrtimer or
>   timer_list timer) are served in task context and we have
>   forced-threaded interrupts. Clearly this means that the seqlock_t (as
>   used here) can only be used task context and not in hard IRQ context.
> 
> - The printk implementation is slightly different and it is being worked
>   to merge it upstream. The two important differences here:
>   - Printing happens by default in a dedicated printing thread.
>   - In emergency cases like panic(), printing happens directly within
>     the invocation of printk(). This requires a so called atomic console
>     which does not use the tty_port::lock.

If I get it correctly, RT solve both possible deadlocks by offloading
the nested operation into a kthread (irq and printk threads).
Plus, printk uses emergency_write() when the kthread is not usable.

If this is true then printk_safe_enter() might be a nop on RT.
All possible deadlocks are prevented either by the kthread or
the console->emergency_write() call.

But wait. AFAIK, the printk kthread is implemented only for the serial
console. So, it won't be safe for other consoles, especially
the problematic tty_insert_flip_string_and_push_buffer() call.

Note that adding printk thread for the graphical consoles will
be a real challenge. We have hard times even with the basic
UART 8250. There are still races possible in the implementation
in the RT tree...

OK, what about using migrate_disable() in printk_deferred_enter()?
Something like:

/*
 * The printk_deferred_enter/exit macros are available only as a hack.
 * They define a per-CPU context where all printk console printing
 * is deferred because it might cause a deadlock otherwise.
 *
 * It is highly recommended to use them only in a context with interrupts
 * disabled. Otherwise, other unrelated printk() calls might be deferred
 * when they interrupt/preempt the deferred code section.
 *
 * They should not be used for to deffer printing of many messages. It might
 * create softlockup when they are flushed later.
 *
 * IMPORTANT: Any new use of these MUST be consulted with printk maintainers.
 *    It might have unexpected side effects on the printk infrastructure.
 */
#ifdef CONFIG_PREEMPT_RT

#define printk_deferred_enter()			\
	do {					\
		migrate_disable();		\
		__printk_safe_enter();		\
	} while (0)

#define printk_deferred_exit()			\
	do {					\
		__printk_safe_exit();		\
		migrate_enable();		\
	} while (0)

#else  /* CONFIG_PREEMPT_RT */

#define printk_deferred_enter()			\
	do {					\
		preempt_disable();		\
		__printk_safe_enter();		\
	} while (0)

#define printk_deferred_exit()			\
	do {					\
		__printk_safe_exit();		\
		preempt_enable();		\
	} while (0)

#endif   /* CONFIG_PREEMPT_RT */

Note that I have used preempt_disable() on non-RT because it is much
cheaper. And IRQs will be disabled anyway on non-RT system
in this code path.

> > Disabling IRQ before incrementing zonelist_update_seq is _required_ for both
> > 
> >   making printk_deferred_enter() safe
> > 
> > and
> > 
> >   making sure that printk_deferred_enter() takes effect
> > 
> > .
> Did I explain why it is sufficient to do
> 	write_seqlock_irqsave()
> 	printk_deferred_enter()
> 
> assuming we have
> 
> | static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
> | {
> |         seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
> |         do_raw_write_seqcount_begin(s);
> | }

Will this prevent any printk() called on the same CPU before
printk_deferred_enter() is called?

Best Regards,
Petr




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux