On 07/05/2024 12:14, Ryan Roberts wrote:
> On 07/05/2024 12:13, David Hildenbrand wrote:
>>
>>> https://github.com/intel/lmbench/blob/master/src/lat_mem_rd.c#L95
>>>
>>>> suggest. If you want to try something semi-randomly; it might be useful to rule
>>>> out the arm64 contpte feature. I don't see how that would be interacting here if
>>>> mTHP is disabled (is it?). But its new for 6.9 and arm64 only. Disable with
>>>> ARM64_CONTPTE (needs EXPERT) at compile time.
>>> I don't enabled mTHP, so it should be not related about ARM64_CONTPTE,
>>> but will have a try.
>>
>> cont-pte can get active if we're just lucky when allocating pages in the right
>> order, correct Ryan?
>
> No it shouldn't do; it requires the pages to be in the same folio.
>

That said, if we got lucky in allocating the "right" pages, then we will end up
doing an extra function call and a bit of maths per every 16 PTEs in order to
figure out that the span is not contained by a single folio, before backing out
of an attempt to fold. That would probably be just about measurable.

But the regression doesn't kick in until 96K, which is the step after 64K. I'd
expect to see the regression on 64K too if that was the issue. The cacheline is
64K so I suspect it could be something related to the cache?
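
To make the "bit of maths" concrete, here is a rough user-space sketch of the kind of check being described. It is not the kernel code: the function names, the use of virtual page numbers instead of addresses, and the assumption that the check runs only on the last PTE of each 16-entry block (4K pages) are all illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: 16 PTEs per contiguous block, as with 4K pages on arm64. */
#define CONT_PTES 16u

/*
 * Hypothetical helper: true when the virtual page number is the last
 * entry of its 16-PTE block. The sketch assumes a fold is only
 * considered at that point, i.e. once per 16 PTEs.
 */
static bool is_last_in_block(uint64_t vpn)
{
	return (vpn & (CONT_PTES - 1)) == CONT_PTES - 1;
}

/*
 * Hypothetical helper: true when the whole 16-page block containing
 * vpn is covered by one folio. The folio is described by the virtual
 * page number its first page is mapped at and its size in pages.
 */
static bool span_in_single_folio(uint64_t vpn, uint64_t folio_start_vpn,
				 uint64_t folio_nr_pages)
{
	uint64_t span_start = vpn & ~(uint64_t)(CONT_PTES - 1);
	uint64_t span_end = span_start + CONT_PTES;
	uint64_t folio_end = folio_start_vpn + folio_nr_pages;

	return folio_start_vpn <= span_start && span_end <= folio_end;
}

/*
 * Folding is attempted only when both conditions hold. With mTHP
 * disabled and small (single-page) folios, the second check fails
 * and the attempt is abandoned.
 */
static bool would_fold(uint64_t vpn, uint64_t folio_start_vpn,
		       uint64_t folio_nr_pages)
{
	return is_last_in_block(vpn) &&
	       span_in_single_folio(vpn, folio_start_vpn, folio_nr_pages);
}
```

With single-page folios, would_fold() returns false for every page, which matches the point above: the cost is the check itself, not any actual folding.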