Re: [PATCH printk v5 1/1] printk: extend console_lock for per-console locking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi All,

On 27.04.2022 09:08, Marek Szyprowski wrote:
> On 26.04.2022 15:16, Petr Mladek wrote:
>> On Tue 2022-04-26 14:07:42, Petr Mladek wrote:
>>> On Mon 2022-04-25 23:04:28, John Ogness wrote:
>>>> Currently threaded console printers synchronize against each
>>>> other using console_lock(). However, different console drivers
>>>> are unrelated and do not require any synchronization between
>>>> each other. Removing the synchronization between the threaded
>>>> console printers will allow each console to print at its own
>>>> speed.
>>>>
>>>> But the threaded consoles printers do still need to synchronize
>>>> against console_lock() callers. Introduce a per-console mutex
>>>> and a new console boolean field @blocked to provide this
>>>> synchronization.
>>>>
>>>> console_lock() is modified so that it must acquire the mutex
>>>> of each console in order to set the @blocked field. Console
>>>> printing threads will acquire their mutex while printing a
>>>> record. If @blocked was set, the thread will go back to sleep
>>>> instead of printing.
>>>>
>>>> The reason for the @blocked boolean field is so that
>>>> console_lock() callers do not need to acquire multiple console
>>>> mutexes simultaneously, which would introduce unnecessary
>>>> complexity due to nested mutex locking. Also, a new field
>>>> was chosen instead of adding a new @flags value so that the
>>>> blocked status could be checked without concern of reading
>>>> inconsistent values due to @flags updates from other contexts.
>>>>
>>>> Threaded console printers also need to synchronize against
>>>> console_trylock() callers. Since console_trylock() may be
>>>> called from any context, the per-console mutex cannot be used
>>>> for this synchronization. (mutex_trylock() cannot be called
>>>> from atomic contexts.) Introduce a global atomic counter to
>>>> identify if any threaded printers are active. The threaded
>>>> printers will also check the atomic counter to identify if the
>>>> console has been locked by another task via console_trylock().
>>>>
>>>> Note that @console_sem is still used to provide synchronization
>>>> between console_lock() and console_trylock() callers.
>>>>
>>>> A locking overview for console_lock(), console_trylock(), and the
>>>> threaded printers is as follows (pseudo code):
>>>>
>>>> console_lock()
>>>> {
>>>>          down(&console_sem);
>>>>          for_each_console(con) {
>>>>                  mutex_lock(&con->lock);
>>>>                  con->blocked = true;
>>>>                  mutex_unlock(&con->lock);
>>>>          }
>>>>          /* console_lock acquired */
>>>> }
>>>>
>>>> console_trylock()
>>>> {
>>>>          if (down_trylock(&console_sem) == 0) {
>>>>                  if (atomic_cmpxchg(&console_kthreads_active, 0, 
>>>> -1) == 0) {
>>>>                          /* console_lock acquired */
>>>>                  }
>>>>          }
>>>> }
>>>>
>>>> threaded_printer()
>>>> {
>>>>          mutex_lock(&con->lock);
>>>>          if (!con->blocked) {
>>>>         /* console_lock() callers blocked */
>>>>
>>>>                  if 
>>>> (atomic_inc_unless_negative(&console_kthreads_active)) {
>>>>                          /* console_trylock() callers blocked */
>>>>
>>>>                          con->write();
>>>>
>>>> atomic_dec(&console_lock_count);
>>>>                  }
>>>>          }
>>>>          mutex_unlock(&con->lock);
>>>> }
>>>>
>>>> The console owner and waiter logic now only applies between contexts
>>>> that have taken the console_lock via console_trylock(). Threaded
>>>> printers never take the console_lock, so they do not have a
>>>> console_lock to handover. Tasks that have used console_lock() will
>>>> block the threaded printers using a mutex and if the console_lock
>>>> is handed over to an atomic context, it would be unable to unblock
>>>> the threaded printers. However, the console_trylock() case is
>>>> really the only scenario that is interesting for handovers anyway.
>>>>
>>>> @panic_console_dropped must change to atomic_t since it is no longer
>>>> protected exclusively by the console_lock.
>>>>
>>>> Since threaded printers remain asleep if they see that the console
>>>> is locked, they now must be explicitly woken in __console_unlock().
>>>> This means wake_up_klogd() calls following a console_unlock() are
>>>> no longer necessary and are removed.
>>>>
>>>> Also note that threaded printers no longer need to check
>>>> @console_suspended. The check for the @blocked field implicitly
>>>> covers the suspended console case.
>>>>
>>>> Signed-off-by: John Ogness <john.ogness@xxxxxxxxxxxxx>
>>> Nice, it it better than v4. I am going to push this for linux-next.
>>>
>>> Reviewed-by: Petr Mladek <pmladek@xxxxxxxx>
>> JFYI, I have just pushed this patch instead of the one
>> from v4 into printk/linux.git, branch rework/kthreads.
>>
>> It means that this branch has been rebased. It will be
>> used in the next refresh of linux-next.
>
> This patchset landed in linux next-20220426. In my tests I've found 
> that it causes deadlock on all my Amlogic Meson G12B/SM1 based boards: 
> Odroid C4/N2 and Khadas VIM3/VIM3l. The deadlock happens when system 
> boots to userspace and getty (with automated login) is executed. I 
> even see the bash prompt, but then the console is freezed. Reverting 
> this patch (e00cc0e1cbf4) on top of linux-next (together with 
> 6b3d71e87892 to make revert clean) fixes the issue.
>
The Amlogic Meson related issue has been investigated and fixed:

https://lore.kernel.org/all/b7c81f02-039e-e877-d7c3-6834728d2117@xxxxxxxxxxx/

but I just found that there is one more issue.


It appears on QCom-based DragonBoard 410c SBC 
(arch/arm64/boot/dts/qcom/apq8016-sbc.dts). To see it on today's linux 
next-20220506, one has to revert 
42cd402b8fd4672b692400fe5f9eecd55d2794ac, otherwise lockdep triggers 
other warning and it is turned off too early:

================================
WARNING: inconsistent lock state
5.18.0-rc5-next-20220506+ #11869 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
ffff80000aaa8478 (&port_lock_key){?.+.}-{2:2}, at: msm_uart_irq+0x38/0x750
{HARDIRQ-ON-W} state was registered at:
   lock_acquire.part.0+0xe0/0x230
   lock_acquire+0x68/0x84
   _raw_spin_lock+0x5c/0x80
   __msm_console_write+0x1ac/0x220
   msm_console_write+0x48/0x60
   __console_emit_next_record+0x188/0x420
   printk_kthread_func+0x3a0/0x3bc
   kthread+0x118/0x11c
   ret_from_fork+0x10/0x20
irq event stamp: 12182
hardirqs last  enabled at (12181): [<ffff800008e3d2a8>] 
cpuidle_enter_state+0xc4/0x30c

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.18.0-rc5-next-20220506+ #11869
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
  dump_backtrace.part.0+0xd0/0xe0
  show_stack+0x18/0x6c
  dump_stack_lvl+0x8c/0xb8
  dump_stack+0x18/0x34
  print_usage_bug.part.0+0x208/0x22c
  mark_lock+0x710/0x954
  __lock_acquire+0x9fc/0x20cc
  lock_acquire.part.0+0xe0/0x230
  lock_acquire+0x68/0x84
  _raw_spin_lock_irqsave+0x80/0xcc
  msm_uart_irq+0x38/0x750
  __handle_irq_event_percpu+0xac/0x3d0
  handle_irq_event+0x4c/0x120
  handle_fasteoi_irq+0xa4/0x1a0
  generic_handle_domain_irq+0x3c/0x60
  gic_handle_irq+0x44/0xc4
  call_on_irq_stack+0x2c/0x54
  do_interrupt_handler+0x80/0x84
  el1_interrupt+0x34/0x64
  el1h_64_irq_handler+0x18/0x24
  el1h_64_irq+0x64/0x68
  cpuidle_enter_state+0xcc/0x30c
  cpuidle_enter+0x38/0x50
  do_idle+0x22c/0x2bc
  cpu_startup_entry+0x28/0x30
  rest_init+0x110/0x190
  arch_post_acpi_subsys_init+0x0/0x18
  start_kernel+0x6c4/0x704
  __primary_switched+0xc0/0xc8
  INIT: version 2.88 booting
[info] Using makefile-style concurrent boot in runlevel S.


Reverting the following patches on top of linux next-20220506 
fixes/hides this lockdep warning:

6b3d71e87892 ("printk: remove @console_locked")
8e274732115f ("printk: extend console_lock for per-console locking")
09c5ba0aa2fc ("printk: add kthread console printers")


Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux