Hi,

I found that this series of patches solves exactly the problem I am trying to solve.
https://lore.kernel.org/lkml/20201202145837.48040-1-foxhlchen@xxxxxxxxx/

The problem is reported by Brice Goglin on thread:
Re: [PATCH 1/4] drivers core: Introduce CPU type sysfs interface
https://lore.kernel.org/lkml/X60dvJoT4fURcnsF@xxxxxxxxx/

I independently confirmed it on a 96-core AWS c5.metal server. The test does open+read+close on /sys/devices/system/cpu/cpu15/topology/core_id 1000 times. With a single thread, each open+read+close takes ~2.5 us. With one thread per core (96 threads running simultaneously), the same operation takes 540 us each (without much variation), about 200x slower than in the single-threaded case.

My benchmark code is here:
https://github.com/foxhlchen/sysfs_benchmark

The problem can only be observed on large machines (>=16 cores). The more cores you have, the slower it can be. Perf shows that the CPUs spend most of their time (>80%) waiting on mutex locks in kernfs_iop_permission and kernfs_dop_revalidate.

After applying this series, performance gets a huge boost: per-operation latency ranges from ~30 us at the fastest to ~180 us at the worst (most of the time is spent on spin_locks, with the delay just stacking up, very similar to the performance on ext4).

I hope this problem justifies this series of patches. A big mutex in kernfs is really not nice. Because of this BIG LOCK, there is almost no concurrency in kernfs: even operations on different files contend with each other. As normal machines get more and more cores, and because sysfs provides such important information, this problem should be fixed. So please reconsider accepting the patches.

For the patches, there is a mutex_lock in kn->attr_mutex, as Tejun mentioned here (https://lore.kernel.org/lkml/X8fe0cmu+aq1gi7O@xxxxxxxxxxxxxxx/), maybe a global rwsem for kn->iattr would be better?

thanks,
fox
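
For reference, the measurement described above (open+read+close on one sysfs file, single thread versus one thread per core) can be sketched roughly as follows. This is only an illustrative sketch and NOT the code from the sysfs_benchmark repository linked above: the names bench_open_read_close, struct worker and the BENCH_MAIN switch are made up for this example, threads are not pinned to cores, and there is no start barrier, so the absolute numbers will differ from the ones quoted.

```c
/* Illustrative sketch only: NOT the code from the linked sysfs_benchmark repo. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

struct worker {                 /* hypothetical per-thread state */
    const char *path;           /* file to open+read+close */
    int iterations;             /* cycles per thread */
    double avg_us;              /* result: mean us per cycle, -1 on error */
};

static double now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    char buf[256];
    double start = now_us();

    for (int i = 0; i < w->iterations; i++) {
        int fd = open(w->path, O_RDONLY);
        if (fd < 0 || read(fd, buf, sizeof(buf)) < 0) {
            if (fd >= 0)
                close(fd);
            w->avg_us = -1.0;
            return NULL;
        }
        close(fd);
    }
    w->avg_us = (now_us() - start) / w->iterations;
    return NULL;
}

/* Mean per-operation latency in us, averaged over threads; -1.0 on failure. */
double bench_open_read_close(const char *path, int nthreads, int iterations)
{
    assert(nthreads > 0 && iterations > 0);
    pthread_t *tids = calloc(nthreads, sizeof(*tids));
    struct worker *ws = calloc(nthreads, sizeof(*ws));
    double sum = 0.0;
    int started = 0, failed = 0;

    if (!tids || !ws) {
        free(tids);
        free(ws);
        return -1.0;
    }
    for (int i = 0; i < nthreads; i++) {
        ws[i].path = path;
        ws[i].iterations = iterations;
        if (pthread_create(&tids[i], NULL, worker_main, &ws[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (ws[i].avg_us < 0)
            failed = 1;
        else
            sum += ws[i].avg_us;
    }
    free(tids);
    free(ws);
    return failed ? -1.0 : sum / started;
}

#ifdef BENCH_MAIN               /* optional command-line driver */
int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1]
        : "/sys/devices/system/cpu/cpu15/topology/core_id";
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);

    if (ncpu < 1)
        ncpu = 1;
    printf("1 thread: %.1f us/op\n", bench_open_read_close(path, 1, 1000));
    printf("%ld threads: %.1f us/op\n", ncpu,
           bench_open_read_close(path, (int)ncpu, 1000));
    return 0;
}
#endif
```

Assuming the file is saved as bench.c (a hypothetical name), something like `gcc -O2 -pthread -DBENCH_MAIN bench.c -o bench` builds the driver. Comparing the two printed values before and after applying the series should show the same kind of gap as above, if the contention is present on the machine.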