Re: Hard and soft lockups with FIO and LTP runs on a large system

Karim Manaouil <kmanaouil.dev@xxxxxxxxx> · Wed, 17 Jul 2024 17:34:03 +0100

On Wed, Jul 17, 2024 at 11:42:31AM +0200, Vlastimil Babka wrote:
> Seems to me it could be (except that ZONE_DMA corner case) a general
> scalability issue in that you tweak some part of the kernel and the
> contention moves elsewhere. At least in MM we have per-node locks so this
> means 256 CPUs per lock? It used to be that there were not that many
> (cores/threads) per a physical CPU and its NUMA node, so many cpus would
> mean also more NUMA nodes where the locks contention would distribute among
> them. I think you could try fakenuma to create these nodes artificially and
> see if it helps for the MM part. But if the contention moves to e.g. an
> inode lock, I'm not sure what to do about that then.

AMD EPYC BIOSes have an option called NPS (Nodes Per Socket) that can be
set to 1, 2, 4 or 8 and that divides each socket into the chosen number
of NUMA nodes.
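
To check what a given NPS (or fakenuma) setting actually produces, one can
count the CPUs behind each NUMA node from sysfs. The following is only a
minimal sketch, assuming a Linux system that exposes
/sys/devices/system/node/nodeN/cpulist; the helper names parse_cpulist and
cpus_per_node are made up for illustration and are not part of any existing
tool.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a sysfs cpulist string such as '0-3,8-11' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            # An empty cpulist (e.g. a memory-only node) yields no CPUs.
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpus_per_node(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: number_of_cpus} for every nodeN directory found."""
    counts = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        node_id = int(re.search(r"node(\d+)$", path).group(1))
        try:
            with open(os.path.join(path, "cpulist")) as f:
                counts[node_id] = len(parse_cpulist(f.read()))
        except OSError:
            # Skip nodes whose cpulist cannot be read.
            continue
    return counts


if __name__ == "__main__":
    for node, n in sorted(cpus_per_node().items()):
        print(f"node{node}: {n} CPUs")
```

Running this before and after changing the setting shows how many CPUs end
up sharing each node, and therefore each per-node lock.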

Karim
PhD Student
Edinburgh University