Hi,

> On Sun, 12 May 2024 07:44:25 -0700, Paul E. McKenney wrote:
> > On Sun, May 12, 2024 at 08:02:59AM +0200, John Paul Adrian Glaubitz wrote:
> >> On Sat, 2024-05-11 at 18:26 -0700, Paul E. McKenney wrote:
> >> > And that breaks things because it can clobber concurrent stores to
> >> > other bytes in that enclosing machine word.
> >>
> >> But pre-EV56 Alpha has always been like this. What makes it broken
> >> all of a sudden?
> >
> > I doubt if it was sudden. Putting concurrently (but rarely) accessed
> > small-value quantities into single bytes is a very natural thing to do,
> > and I bet that there are quite a few places in the kernel where exactly
> > this happens. I happen to know of a specific instance that went into
> > mainline about two years ago.
> >
> > So why didn't the people running current mainline on pre-EV56 Alpha
> > systems notice? One possibility is that they are upgrading their
> > kernels only occasionally. Another possibility is that they are seeing
> > the failures, but are not tracing the obtuse failure modes back to the
> > change(s) in question. Yet another possibility is that the resulting
> > failures are very low probability, with mean times to failure that are
> > so long that you won't notice anything on a single system.
>
> Another possibility is that the Jensen system was booted into uniprocessor
> mode. Looking at the early boot log [1] provided by Ulrich (+CCed) back in
> Sept. 2021, I see the following by running "grep -i cpu":
>
> >> > [1] https://marc.info/?l=linux-alpha&m=163265555616841&w=2
>
> [ 0.000000] Memory: 90256K/131072K available (8897K kernel code, 9499K rwdata, 2704K rodata, 312K init, 437K bss, 40816K reserved, 0K cma-reserved)
> [ 0.000000] random: get_random_u64 called from __kmem_cache_create+0x54/0x600 with crng_init=0
> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
>                                                         ^^^^^^
>
> Without any concurrent atomic updates, the "broken" atomic accesses won't
> matter, I guess.

I've probably disabled SMP in my test kernel; the Jensen is a single-CPU
system. I never had the pleasure of owning an AlphaServer 2000 or 2100,
which (according to https://en.wikipedia.org/wiki/AlphaServer and
https://en.wikipedia.org/wiki/AlphaStation) are the only systems with
EV4/EV45/EV5 multi-CPU setups (apart from the Cray T3{DE}), so the
possibility of ever seeing an error concerning atomic concurrent updates
is quite low. Anybody out there with an AlphaServer 2000/2100 willing
to try ?-)

CU,
Uli
-- 
Dipl. Inf. Ulrich Teichert|e-mail: Ulrich.Teichert@xxxxxx | Listening to:
Stormweg 24               |The Hives: Two Kinds Of Trouble, The Chats: 6L GTR,
24539 Neumuenster, Germany|La Fraction: Les Démons, Nightwatchers: On a Mission