On Tue, 2020-12-15 at 16:33 +0800, Fox Chen wrote:
> On Mon, Dec 14, 2020 at 9:30 PM Ian Kent <raven@xxxxxxxxxx> wrote:
> > On Mon, 2020-12-14 at 14:14 +0800, Fox Chen wrote:
> > > On Sun, Dec 13, 2020 at 11:46 AM Ian Kent <raven@xxxxxxxxxx> wrote:
> > > > On Fri, 2020-12-11 at 10:17 +0800, Ian Kent wrote:
> > > > > On Fri, 2020-12-11 at 10:01 +0800, Ian Kent wrote:
> > > > > > > For the patches, there is a mutex_lock in kn->attr_mutex,
> > > > > > > as Tejun mentioned here
> > > > > > > (https://lore.kernel.org/lkml/X8fe0cmu+aq1gi7O@xxxxxxxxxxxxxxx/),
> > > > > > > maybe a global rwsem for kn->iattr will be better??
> > > > > >
> > > > > > I wasn't sure about that. IIRC a spin lock could be used
> > > > > > around the initial check and checked again at the end, which
> > > > > > would probably have been much faster but much less
> > > > > > conservative and a bit more ugly, so I just went the
> > > > > > conservative path since there was so much change already.
> > > > >
> > > > > Sorry, I hadn't looked at Tejun's reply yet and TBH didn't
> > > > > remember it.
> > > > >
> > > > > Based on what Tejun said it sounds like that needs work.
> > > >
> > > > Those attribute handling patches were meant to allow taking the
> > > > rwsem read lock instead of the write lock for
> > > > kernfs_refresh_inode() updates, with the added locking to
> > > > protect the inode attributes update since it's called from the
> > > > VFS both with and without the inode lock.
> > >
> > > Oh, understood. I was asking also because the lock on
> > > kn->attr_mutex drags down concurrent performance.
> > >
> > > > Looking around it looks like kernfs_iattrs() is called from
> > > > multiple places without a node database lock at all.
> > > >
> > > > I'm thinking that, to keep my proposed change straightforward
> > > > and on topic, I should just leave kernfs_refresh_inode() taking
> > > > the node db write lock for now and consider the attributes
> > > > handling as a separate change. Once that's done we could
> > > > reconsider what's needed to use the node db read lock in
> > > > kernfs_refresh_inode().
> > >
> > > You mean taking the write lock of kernfs_rwsem for
> > > kernfs_refresh_inode()?
> > > It may be a lot slower in my benchmark, let me test it.
> >
> > Yes, but make sure the write lock of kernfs_rwsem is being taken,
> > not the read lock.
> >
> > That's a mistake I had initially.
> >
> > Still, that attributes handling is, I think, sufficient to warrant
> > a separate change since it looks like it might need work. The
> > kernfs node db probably should be kept stable for those attribute
> > updates, but equally the existence of an instantiated dentry might
> > mitigate it.
> >
> > Some people might just know whether it's ok or not, but I would
> > like to check the callers to work out what's going on.
> >
> > In any case it's academic if GCH isn't willing to consider the
> > series for review and possible merge.
>
> Hi Ian,
>
> I removed kn->attr_mutex and changed the read lock to a write lock
> for kernfs_refresh_inode:
>
>     down_write(&kernfs_rwsem);
>     kernfs_refresh_inode(kn, inode);
>     up_write(&kernfs_rwsem);
>
> Unfortunately, changes made this way make things worse; my benchmark
> runs 100% slower than upstream sysfs. :(
> Open+read+close on a sysfs file concurrently took 1000us.
> (Currently, sysfs with a big mutex kernfs_mutex only takes ~500us
> for one open+read+close operation concurrently.)

Right, so it does need attention nowish.

I'll have a look at it in a while. I really need to get a new autofs
release out, and there are quite a few changes, and testing is seeing
a number of errors, some old, some newly introduced. It's proving
difficult.

>
>     --45.93%--kernfs_iop_permission
>               |
>                --22.55%--down_write
>                          |
>                           --20.69%--rwsem_down_write_slowpath
>                                     |
>                                     |--8.89%--schedule
>
> perf showed most of the time had been spent on kernfs_iop_permission.
>
> thanks,
> fox
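
For reference, below is a minimal user-space sketch of the kind of
concurrent open+read+close micro-benchmark described above. The sysfs
path, thread count and iteration count are illustrative assumptions,
not the parameters of the actual test, and the per-operation figure is
simply derived from wall-clock time:

/*
 * Hypothetical sketch of a concurrent open+read+close benchmark on a
 * sysfs file.  Path, thread count and iteration count are assumptions.
 * Build with: gcc -O2 -pthread bench.c -o bench
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define NR_THREADS	16
#define NR_ITERS	10000
/* any small, world-readable sysfs attribute will do */
#define SYSFS_PATH	"/sys/kernel/uevent_seqnum"

static void *worker(void *arg)
{
	char buf[256];
	int i;

	for (i = 0; i < NR_ITERS; i++) {
		/* open() walks the path and goes through kernfs_iop_permission */
		int fd = open(SYSFS_PATH, O_RDONLY);

		if (fd < 0) {
			perror("open");
			exit(1);
		}
		if (read(fd, buf, sizeof(buf)) < 0)
			perror("read");
		close(fd);
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_THREADS];
	struct timespec start, end;
	double wall_us;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < NR_THREADS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (i = 0; i < NR_THREADS; i++)
		pthread_join(tid[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &end);

	wall_us = (end.tv_sec - start.tv_sec) * 1e6 +
		  (end.tv_nsec - start.tv_nsec) / 1e3;
	/* per-thread average latency of one open+read+close */
	printf("~%.1f us per open+read+close\n", wall_us / NR_ITERS);
	return 0;
}

Run under perf against a kernel where kernfs_refresh_inode() takes the
kernfs_rwsem write lock, a benchmark along these lines would be
expected to show the kind of contention reported above, with most of
the time attributed to rwsem_down_write_slowpath under
kernfs_iop_permission.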