Re: [PATCH RT 1/2] tasklet: Address a race resulting in double-enqueue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 23, 2020 at 11:55 PM <zanussi@xxxxxxxxxx> wrote:
>
> From: Zhang Xiao <xiao.zhang@xxxxxxxxxxxxx>
>
> v4.19.115-rt49-rc1 stable review patch.
> If anyone has any objections, please let me know.
>
> -----------
>
>
> The kernel bugzilla has the following race condition reported:
>
> CPU0                    CPU1            CPU2
> ------------------------------------------------
> test_set SCHED
>  test_set RUN
>    if SCHED
>     add_list
>     raise
>     clear RUN
> <softirq>
> test_set RUN
> test_clear SCHED
>  ->func
>                         test_set SCHED
> tasklet_try_unlock ->0
> test_clear SCHED
>                                         test_set SCHED
>  ->func
> tasklet_try_unlock ->1
>                                         test_set RUN
>                                         if SCHED
>                                         add list
>                                         raise
>                                         clear RUN
>                         test_set RUN
>                         if SCHED
>                          add list
>                          raise
>                          clear RUN
>
> As a result the tasklet is enqueued on both CPUs and run on both CPUs. Due
> to the nature of the list used here, it is possible that further
> (different) tasklets, which are enqueued after this double-enqueued
> tasklet, are scheduled on CPU2 but invoked on CPU1. It is also possible
> that these tasklets won't be invoked at all, because during the second
> enqueue process the t->next pointer is set to NULL - dropping everything
> from the list.
>
> This race will trigger one or two of the WARN_ON() in
> tasklet_action_common().
> The problem is that the tasklet may be invoked multiple times and clear
> SCHED bit on each invocation. This makes it possible to enqueue the
> very same tasklet on different CPUs.
>
> Current RT-devel is using the upstream implementation which does not
> re-run tasklets if they have SCHED set again and so it does not clear
> the SCHED bit multiple times on a single invocation.
>
> Introduce the CHAINED flag. The tasklet will only be enqueued if the
> CHAINED flag has been set successfully.
> If it is possible to exchange the flags (CHAINED | RUN) -> 0 then the
> tasklet won't be re-run. Otherwise the possible SCHED flag is removed
> and the tasklet is re-run again.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=61451
> Not-signed-off-by: Zhang Xiao <xiao.zhang@xxxxxxxxxxxxx>
> [bigeasy: patch description]
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
>
> Signed-off-by: Tom Zanussi <zanussi@xxxxxxxxxx>
> ---
>  include/linux/interrupt.h | 5 ++++-
>  kernel/softirq.c          | 6 +++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 97d9ba26915e..a3b5edb26bc5 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -579,12 +579,15 @@ enum
>  {
>         TASKLET_STATE_SCHED,    /* Tasklet is scheduled for execution */
>         TASKLET_STATE_RUN,      /* Tasklet is running (SMP only) */
> -       TASKLET_STATE_PENDING   /* Tasklet is pending */
> +       TASKLET_STATE_PENDING,  /* Tasklet is pending */
> +       TASKLET_STATE_CHAINED   /* Tasklet is chained */
>  };
>
>  #define TASKLET_STATEF_SCHED   (1 << TASKLET_STATE_SCHED)
>  #define TASKLET_STATEF_RUN     (1 << TASKLET_STATE_RUN)
>  #define TASKLET_STATEF_PENDING (1 << TASKLET_STATE_PENDING)
> +#define TASKLET_STATEF_CHAINED (1 << TASKLET_STATE_CHAINED)
> +#define TASKLET_STATEF_RC      (TASKLET_STATEF_RUN | TASKLET_STATEF_CHAINED)
>
>  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT_FULL)
>  static inline int tasklet_trylock(struct tasklet_struct *t)
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 25bcf2f2714b..73dae64bfc9c 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -947,6 +947,10 @@ static void __tasklet_schedule_common(struct tasklet_struct *t,
>          * is locked before adding it to the list.
>          */
>         if (test_bit(TASKLET_STATE_SCHED, &t->state)) {
> +               if (test_and_set_bit(TASKLET_STATE_CHAINED, &t->state)) {
> +                       tasklet_unlock(t);
> +                       return;
> +               }
>                 t->next = NULL;
>                 *head->tail = t;
>                 head->tail = &(t->next);
> @@ -1040,7 +1044,7 @@ static void tasklet_action_common(struct softirq_action *a,
>  again:
>                 t->func(t->data);
>
> -               while (!tasklet_tryunlock(t)) {
> +               while (cmpxchg(&t->state, TASKLET_STATEF_RC, 0) != TASKLET_STATEF_RC) {
>                         /*
>                          * If it got disabled meanwhile, bail out:
>                          */
> --
> 2.17.1
>

Hi, This patch introduced a regression in our kernel
(v4.19.124-rt53-rebase), It occurs when we're jumping to crush kernel
using kexec, in the initialization of the emmc driver.
I'm still debugging the root cause, but I thought of mentioning this
in the mailing list if you have any idea why this could occur.
The issue doesn't happen on normal boot, only when I specifically
crash the kernel into the crash kernel.
Thanks,
Ramon.

[    0.546142] macb 2b00000.eth1 eth1: Cadence GEM rev 0x00070200 at
0x02b00000 irq 16 (00:28:f8:b7:a0:40)
[    0.548041] sdhci: Secure Digital Host Controller Interface driver
[    0.548042] sdhci: Copyright(c) Pierre Ossman
[    0.548044] sdhci-pltfm: SDHCI platform and OF driver helper
[    0.580588] mmc0: SDHCI controller on 2200000.sdhci [2200000.sdhci]
using ADMA 64-bit
[    0.605753] hm, tasklet state: 00000008
[    0.605757] ------------[ cut here ]------------
[    0.605763] WARNING: CPU: 0 PID: 1 at ../kernel/softirq.c:1061
0xa80000080282649c
[    0.605771] CPU: 0 PID: 1 Comm: swapper Not tainted
4.19.124.eyeq5_ssnt-local-rt53-07819-g3519e8cf4110 #4
[    0.605774] Stack : 0000000000000000 0000000000000018
a80000080340bd40 0000000002d40000
[    0.605780]         a80000080340be70 0000000000000000
0000000000000000 000000000000007e
[    0.605785]         0000000000000000 0000000000000001
0000000000000001 0000000002d30000
[    0.605790]         0000000000000000 a800000802b2a688
0000000002d30000 0000000002d30000
[    0.605795]         0000000000000000 0000000000000000
a80000080282649c a800000802b6cf28
[    0.605799]         0000000000000009 0000000000000425
0000000000000000 0000000000000007
[    0.605804]         0000000002d30000 0000000002d30000
0000000000000000 a800000802d20000
[    0.605809]         a800000803430000 a80000080340bd40
a800000802d284c0 a80000080280a6c4
[    0.605813]         0000000000000000 0000000000000000
0000000000000000 0000000000000000
[    0.605818]         0000000000000000 a80000080280a6c8
0000000000000000 a800000802823b0c
[    0.605823]         ...
[    0.605825] Call Trace:
[    0.605830] [<a800000802b2a688>] 0xa800000802b2a688
[    0.605833] [<a80000080282649c>] 0xa80000080282649c
[    0.605835] [<a80000080280a6c4>] 0xa80000080280a6c4
[    0.605838] [<a80000080280a6c8>] 0xa80000080280a6c8
[    0.605840] [<a800000802823b0c>] 0xa800000802823b0c
[    0.605842] [<a80000080282649c>] 0xa80000080282649c
[    0.605845] [<a800000802b30394>] 0xa800000802b30394
[    0.605847] [<a8000008029cd2ec>] 0xa8000008029cd2ec
[    0.605850] [<a800000802805490>] 0xa800000802805490
[    0.605851]
[    0.605854] ---[ end trace 293afee709e91e0b ]---
[    0.607548] stn8500_serial 800000.uart: detected port #0
[    0.607576] stn8500_serial 800000.uart: uartclk 125000000
[    0.607605] 800000.uart: ttyST0 at MMIO 0x800000 (irq = 46,
base_baud = 7812500) is a stn8500
[    0.610303] stn8500_serial 900000.uart: detected port #1
[    0.610324] stn8500_serial 900000.uart: uartclk 125000000
[    0.611564] 900000.uart: ttyST1 at MMIO 0x900000 (irq = 46,
base_baud = 7812500) is a stn8500
[    0.614645] stn8500_serial a00000.uart: detected port #2
[    0.614891] stn8500_serial a00000.uart: uartclk 125000000
[    0.614919] a00000.uart: ttyST2 at MMIO 0xa00000 (irq = 46,
base_baud = 7812500) is a stn8500
[    0.614930] stn8500_serial a00000.uart: console setup on port #2 :
uartclk 125000000 115207-n-8-n
[    0.615943] console [ttyST2] enabled
[    0.615945] bootconsole [stn8500] disabled
[    0.620959] Freeing unused kernel memory: 1324K
[    0.620965] This architecture does not have kernel memory protection.
[    0.620967] Run /init as init process
[    0.710641] random: dd: uninitialized urandom read (512 bytes read)



[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux