Hi Michael, On Thu, Aug 31, 2023 at 7:39 AM Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote: > Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> writes: > > On Fri, 18 Aug 2023, Liam R. Howlett wrote: > >> The current implementation of append may cause duplicate data and/or > >> incorrect ranges to be returned to a reader during an update. Although > >> this has not been reported or seen, disable the append write operation > >> while the tree is in rcu mode out of an abundance of caution. > >> > >> During the analysis of the mas_next_slot() the following was > >> artificially created by separating the writer and reader code: > >> > >> Writer: reader: > >> mas_wr_append > >> set end pivot > >> updates end metata > >> Detects write to last slot > >> last slot write is to start of slot > >> store current contents in slot > >> overwrite old end pivot > >> mas_next_slot(): > >> read end metadata > >> read old end pivot > >> return with incorrect range > >> store new value > >> > >> Alternatively: > >> > >> Writer: reader: > >> mas_wr_append > >> set end pivot > >> updates end metata > >> Detects write to last slot > >> last lost write to end of slot > >> store value > >> mas_next_slot(): > >> read end metadata > >> read old end pivot > >> read new end pivot > >> return with incorrect range > >> set old end pivot > >> > >> There may be other accesses that are not safe since we are now updating > >> both metadata and pointers, so disabling append if there could be rcu > >> readers is the safest action. > >> > >> Fixes: 54a611b60590 ("Maple Tree: add new data structure") > >> Cc: stable@xxxxxxxxxxxxxxx > >> Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> > > > > Thanks for your patch, which is now commit cfeb6ae8bcb96ccf > > ("maple_tree: disable mas_wr_append() when other readers are > > possible") in v6.5, and is being backported to stable. > > > > On Renesas RZ/A1 and RZ/A2 (single-core Cortex-A9), this causes the > > following warning: > > > > clocksource: timer@e803b000: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 28958491609 ns > > sched_clock: 32 bits at 66MHz, resolution 15ns, wraps every 32537631224ns > > /soc/timer@e803b000: used for clocksource > > /soc/timer@e803c000: used for clock events > > +------------[ cut here ]------------ > > +WARNING: CPU: 0 PID: 0 at init/main.c:992 start_kernel+0x2f0/0x480 > > +Interrupts were enabled early > ... > > > > I do not see this issue on any other platform > > (arm/arm64/risc-v/mips/sh/m68k), several of them use the same > > RCU configuration. > > There's something similar on pmac32 / mac99. > > > Do you have a clue? > > It seems something in the maple tree code is setting TIF_NEED_RESCHED, > and that causes a subsequent call to cond_resched() to call schedule() > and enable interrupts. > > On pmac32 enabling CONFIG_DEBUG_ATOMIC_SLEEP fixes/hides the problem. > But I don't see why. Enabling CONFIG_DEBUG_ATOMIC_SLEEP on RZ/A1 and RZ/A2 does fix the problem. But there must be more to it, as some of my test configs had it enabled, and others hadn't, while only RZ/A showed the issue. I tried disabling it on R-Car M2-W (arm32) and R-Car H3 (arm64), and that did not cause the problem to happen... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds