On Sat, Jul 02, 2022 at 12:08:46PM +0100, Kajetan Puchalski wrote:
> On Fri, Jul 01, 2022 at 10:01:10PM +0200, Florian Westphal wrote:
> > Kajetan Puchalski <kajetan.puchalski@xxxxxxx> wrote:
> > > While running the udp-flood test from stress-ng on Ampere Altra (Mt.
> > > Jade platform) I encountered a kernel panic caused by NULL pointer
> > > dereference within nf_conntrack.
> > >
> > > The issue is present in the latest mainline (5.19-rc4), latest stable
> > > (5.18.8), as well as multiple older stable versions. The last working
> > > stable version I found was 5.15.40.
> >
> > Do I need a special setup for conntrack?
>
> I don't think there was any special setup involved, the config I started
> from was a generic distribution config and I didn't change any
> networking-specific options. In case that's helpful, here's the .config
> I used:
>
> https://pastebin.com/Bb2wttdx
>
> > No crashes after more than one hour of stress-ng on
> > 1. 4 core amd64 Fedora 5.17 kernel
> > 2. 16 core amd64, linux stable 5.17.15
> > 3. 12 core intel, Fedora 5.18 kernel
> > 4. 3 core aarch64 vm, 5.18.7-200.fc36.aarch64
>
> That would make sense; from further experiments I ran, it somehow seems
> to be related to the number of workers being spawned by stress-ng along
> with the CPUs/cores involved.
>
> For instance, running the test with <=25 workers (--udp-flood 25 etc.)
> results in the test running fine for at least 15 minutes.

Another point to keep in mind is that modern ARM processors (ARMv8.1
and above) have a more relaxed memory model than older ones (and than
x86), which can easily expose a missing barrier somewhere. I already
faced this situation in the past, the first time I ran my code on
Graviton2: it caused crashes that would never happen on A53/A72/A73
cores nor on x86.

ARMv8.1 SoCs are not yet widely available to end users like us; the
A76 is only coming, and the A55 has been available for a bit more than
a year. So testing on regular ARM devices like the RPi etc. may not
exhibit such differences.

> Running the test with 30 workers results in a panic sometime before it
> hits the 15 minute mark.
> Based on observations there seems to be a correlation between the
> number of workers and how quickly the panic occurs, i.e. with 30 it
> takes a few minutes, with 160 it consistently happens almost
> immediately. That also holds for various numbers of workers in between.
>
> On the CPU/core side of things, the machine in question has two CPU
> sockets with 80 identical cores each. All the panics I've encountered
> happened when stress-ng was run directly and unbound.
> When I tried using hwloc-bind to bind the process to one of the CPU
> sockets, the test ran for 15 mins with 80 and 160 workers with no
> issues, no matter which CPU it was bound to.
>
> I.e. the specific circumstances under which it seems to occur are when
> the test is able to run across multiple CPU sockets with a large number
> of workers being spawned.

This could further fuel the possibility explained above.

Regards,
Willy
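
P.S.: to make the "missing barrier" hypothesis more concrete, below is
a minimal userspace sketch of the general pattern (just an assumed
illustration using C11 atomics, not the actual nf_conntrack code). With
a relaxed store, a weakly ordered arm64 CPU is allowed to make the
pointer visible to other cores before the payload it points to, so a
reader can dereference a "published" object whose fields are not yet
initialised; x86-TSO never reorders the two stores, which is why such
bugs tend to stay invisible there:

  /* barrier-sketch.c: build with gcc -O2 -pthread barrier-sketch.c */
  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdio.h>

  struct item {
          int data;
  };

  static struct item slot;
  static _Atomic(struct item *) shared;   /* NULL until published */

  static void *writer(void *arg)
  {
          slot.data = 42;
          /*
           * BROKEN on weakly ordered CPUs: a relaxed store does not
           * order the write of slot.data before the pointer store.
           * The fix is memory_order_release here (the userspace
           * equivalent of the kernel's smp_store_release()).
           */
          atomic_store_explicit(&shared, &slot, memory_order_relaxed);
          return NULL;
  }

  static void *reader(void *arg)
  {
          struct item *it;

          /* acquire pairs with the writer's (missing) release */
          while (!(it = atomic_load_explicit(&shared,
                                             memory_order_acquire)))
                  ;
          printf("data = %d\n", it->data); /* can print 0 with the bug */
          return NULL;
  }

  int main(void)
  {
          pthread_t w, r;

          pthread_create(&r, NULL, reader, NULL);
          pthread_create(&w, NULL, writer, NULL);
          pthread_join(w, NULL);
          pthread_join(r, NULL);
          return 0;
  }

A single run will rarely show the reordering; as with stress-ng, it
takes many iterations and real cross-socket traffic before the window
is hit, which would be consistent with "more workers crash faster".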