On Mon, 2020-05-25 at 08:16 +0200, Greg Kroah-Hartman wrote:
> On Mon, May 25, 2020 at 01:46:59PM +0800, Ian Kent wrote:
> > For very large systems with hundreds of CPUs and TBs of RAM, booting
> > can take a very long time.
> >
> > Initial reports showed that booting a configuration of several
> > hundred CPUs and 64TB of RAM would take more than 30 minutes and
> > require the kernel parameters udev.children-max=1024 and
> > systemd.default_timeout_start_sec=3600 to prevent dropping into
> > emergency mode.
> >
> > Gathering information about what's happening during the boot is a
> > bit challenging. But the two main issues appeared to be a large
> > number of path lookups for non-existent files, and high lock
> > contention in the VFS during path walks, particularly in the dentry
> > allocation code path.
> >
> > The underlying cause of this was believed to be the sheer number of
> > sysfs memory objects, 100,000+ for a 64TB memory configuration.
>
> Independent of your kernfs changes, why do we really need to
> represent all of this memory with that many different "memory
> objects"? What is that providing to userspace?
>
> I remember Ben Herrenschmidt did a lot of work on some of the kernfs
> and other functions to make large-memory systems boot faster,
> removing some of the complexity in those functions, but that too did
> not look into why we needed to create so many objects in the first
> place.
>
> Perhaps you might want to look there instead?

I presumed it was a hardware design requirement or an IBM VM design
requirement. Perhaps Rick can find out more on that question.

Ian
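
For scale, a minimal sketch of the arithmetic behind the 100,000+
object figure above. The 256MB memory block size is an assumption (it
varies by architecture; on a live system it can be read from
/sys/devices/system/memory/block_size_bytes, which reports hex):

    # Rough estimate of the number of sysfs memory block objects for a
    # given amount of RAM. The block size below is an assumption, not
    # a measured value.

    TIB = 1 << 40
    MIB = 1 << 20

    ram_bytes = 64 * TIB          # the 64TB configuration from the report
    block_size = 256 * MIB        # assumed memory block size

    print(ram_bytes // block_size)   # 262144 memory blocks

Each such block appears as a memoryNNN directory under
/sys/devices/system/memory/, which is consistent with the 100,000+
sysfs objects, and with the dentry allocations and path lookups,
mentioned above.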