Hi Robin,

During some of the stress tests we also came across a different warning from the arm64 page management code.
It looks like a race is detected between HW and SW marking a bit in the PTE.
Not sure it's really related, but I thought it might give a clue on the issue.
http://pastebin.com/ASv19vZP

Thanks,
Yehuda

> -----Original Message-----
> From: Marcin Wojtas [mailto:mw@xxxxxxxxxxxx]
> Sent: Tuesday, May 31, 2016 13:30
> To: Robin Murphy
> Cc: linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; Lior Amsalem; Thomas Petazzoni;
> Yehuda Yitschak; Catalin Marinas; Arnd Bergmann; Grzegorz Jaszczyk;
> Will Deacon; Nadav Haklai; Tomasz Nowicki; Gregory Clément
> Subject: Re: [BUG] Page allocation failures with newest kernels
>
> Hi Robin,
>
> > I remember there were some issues around 4.2 with the revision of the
> > arm64 atomic implementations affecting the cmpxchg_double() in SLUB,
> > but those should all be fixed (and the symptoms tended to be
> > considerably more fatal).
> > A stronger candidate would be 97303480753e (which landed in 4.4),
> > which has various knock-on effects on the layout of SLUB internals -
> > does fiddling with L1_CACHE_SHIFT make any difference?
>
> I'll check the commits, thanks. I forgot to add L1_CACHE_SHIFT was my first
> suspect - I had spent a long time debugging network controller, which
> stopped working because of this change - L1_CACHE_BYTES (and hence
> NET_SKB_PAD) not fitting HW constraints. Anyway reverting it didn't help at
> all for page alloc issue.
>
> Best regards,
> Marcin