On Thu, Oct 31, 2019 at 02:05:51PM -0700, Darrick J. Wong wrote:
> On Fri, Nov 01, 2019 at 07:50:49AM +1100, Dave Chinner wrote:
> > On Wed, Oct 30, 2019 at 08:06:58PM -0700, Darrick J. Wong wrote:
> > > > In the case of the xfs_bufs, I've been running workloads recently
> > > > that cache several million xfs_bufs and only a handful of inodes
> > > > rather than the other way around. If we spread inodes because
> > > > caching millions on a single node can cause problems on large NUMA
> > > > machines, then we also need to spread xfs_bufs...
> > >
> > > Hmm, could we capture this as a comment somewhere?
> >
> > Sure, but where? We're planning on getting rid of the KM_ZONE flags
> > in the near future, and most of this is specific to the impacts on
> > XFS. I could put it in xfs_super.c above where we initialise all the
> > slabs, I guess. Probably a separate patch, though....
>
> Sounds like a reasonable place (to me) to record the fact that we want
> inodes and metadata buffers not to end up concentrating on a single node.

Ok. I'll add yet another patch to the preliminary part of the series.
Any plans to take any of these first few patches in this cycle?

> At least until we start having NUMA systems with a separate "IO node" in
> which to confine all the IO threads and whatnot <shudder>. :P

Been there, done that, got the t-shirt and wore it out years ago.

IO-only nodes (either via software configuration, or real
cpu/memory-less IO nodes) are one of the reasons we don't want
node-local allocation behaviour for large NUMA configs...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx