On 12/6/23, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Dec 06, 2023 at 05:42:34PM +0100, Mateusz Guzik wrote:
>
>> That is to say your patchset is probably an improvement, but this
>> benchmark uses kernfs which is a total crapper, with code like this in
>> kernfs_iop_permission:
>>
>>         root = kernfs_root(kn);
>>
>>         down_read(&root->kernfs_iattr_rwsem);
>>         kernfs_refresh_inode(kn, inode);
>>         ret = generic_permission(&nop_mnt_idmap, inode, mask);
>>         up_read(&root->kernfs_iattr_rwsem);
>>
>>
>> Maybe there is an easy way to dodge this, off hand I don't see one.
>
> At a guess - seqcount on kernfs nodes, bumped on metadata changes
> and a seqretry loop, not that this was the only problem with kernfs
> scalability.
>

I assumed you can't have possibly changing inode fields around
generic_permission.

> That might account for sysinfo side, but not the unixbench - no kernfs
> locks mentioned there.  OTOH, we might be hitting the wall on
> ->i_rwsem with what it's doing...
>

I did not see anything about unixbench, the subject only talks about
stressng.

That said now I'm curious enough what's going on here to give it a
serious poke instead of a quick glance.

-- 
Mateusz Guzik <mjguzik gmail.com>