On Thu, Dec 18, 2008 at 10:34:52PM -0800, Dave Hansen wrote:
> On Fri, 2008-12-19 at 07:19 +0100, Nick Piggin wrote:
> > Hi. Fun, chasing down performance regressions.... I wonder what people think
> > about these patches? Is it OK to bloat struct vfsmount? Any races?
>
> Very cool stuff, Nick. I especially like how much it simplifies things
> and removes *SO* much code.

Thanks.

> Bloating the vfsmount was one of the things that I really, really tried
> to avoid. When I start to think about the SGI machines, it gets me really
> worried. I went to a lot of trouble to make sure that the per-vfsmount
> memory overhead didn't scale with the number of cpus.

Well, OTOH, the SGI machines have a lot of memory ;) I *think* that not many
systems have thousands of mounts (given that the mount hashtable is a
fixed-size single page), but I might be wrong, which is why I ask here.

Let's say a 4096-CPU machine with one mount per CPU (4096 mounts): I think
that should only use about 128MB total for the counters. OK, yes, that is a
lot ;) but not exactly insane for a machine of that size. For a 32-CPU
system with 10,000 mounts, it's 9MB.

> > This could
> > be made even faster if mnt_make_readonly could tolerate a really high latency
> > synchronize_rcu()... can it?)
>
> Yes, I think it can tolerate it. There's a lot of work to do, and we
> already have to go touch all the other per-cpu objects. There also
> tends to be writeout when this happens, so I don't think a few seconds,
> even, will be noticed.

That would be good. After the first patch, mnt_want_write still shows up in
profiles, and almost all the hits come right after the msync, from the
smp_mb there. It would be really nice to use RCU here; I think it might
allow us to eliminate the memory barriers.

> > This patch speeds up the lmbench lat_mmap test by about 8%. lat_mmap is set up
> > basically to mmap a 64MB file on tmpfs, fault in its pages, then unmap it.
> > A microbenchmark, yes, but it exercises some important paths in the mm.
>
> Do you know where the overhead actually came from? Was it the
> spinlocks? Was removing all the atomic ops what really helped?

I think about 95% of the unhalted cycles were hit against the two
instructions after the call to spin_lock. It wasn't actually the flipping of
the write counter's per-cpu cache, as far as I could see. I didn't save the
instruction-level profiles, but I'll do another run if people think it will
be sane to use RCU here.

> I'll take a more in-depth look at your code tomorrow and see if I see
> any races.

Thanks.
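
For anyone following along, here is a minimal sketch of the kind of scheme
being talked about; it is not the posted patch, and every name in it
(mnt_writers, mnt_ro_state, the sketch_* helpers) is invented for
illustration. The memory estimate above corresponds to one long-sized
counter per CPU per mount: 4096 CPUs x 4096 mounts x 8 bytes = 128MB (the
real per-mount, per-CPU footprint may well be larger). The second half
shows how RCU could replace the smp_mb() on the mnt_want_write() fast
path, with mnt_make_readonly() paying the one potentially very slow
synchronize_rcu() instead.

	/*
	 * Illustrative sketch only -- not the actual patch.  All names here
	 * are made up for this example.
	 */
	#include <linux/percpu.h>
	#include <linux/rcupdate.h>
	#include <linux/cpumask.h>
	#include <linux/smp.h>
	#include <linux/errno.h>

	enum { MNT_SKETCH_WRITABLE, MNT_SKETCH_RO_PENDING, MNT_SKETCH_READONLY };

	struct mnt_sketch {
		long *mnt_writers;	/* per-cpu counters, from alloc_percpu(long) */
		int mnt_ro_state;	/* one of the states above */
	};

	/* Fast path: no spinlock, no atomics, no smp_mb(). */
	static int sketch_mnt_want_write(struct mnt_sketch *mnt)
	{
		int cpu, ret = 0;

		rcu_read_lock();
		cpu = get_cpu();
		(*per_cpu_ptr(mnt->mnt_writers, cpu))++;
		/*
		 * No barrier: if sketch_mnt_make_readonly() is racing with us,
		 * its synchronize_rcu() cannot return until we leave this
		 * read-side critical section, so either we see its RO_PENDING
		 * state or it sees our counter update when it sums below.
		 */
		if (mnt->mnt_ro_state != MNT_SKETCH_WRITABLE) {
			(*per_cpu_ptr(mnt->mnt_writers, cpu))--;
			ret = -EROFS;
		}
		put_cpu();
		rcu_read_unlock();
		return ret;
	}

	static void sketch_mnt_drop_write(struct mnt_sketch *mnt)
	{
		int cpu = get_cpu();

		/*
		 * A task may drop on a different CPU than it incremented on, so
		 * individual per-cpu values can go negative; only the sum matters.
		 */
		(*per_cpu_ptr(mnt->mnt_writers, cpu))--;
		put_cpu();
	}

	/*
	 * Slow path.  Assumes callers are already serialized against each
	 * other by whatever lock remount holds today.
	 */
	static int sketch_mnt_make_readonly(struct mnt_sketch *mnt)
	{
		long writers = 0;
		int cpu;

		mnt->mnt_ro_state = MNT_SKETCH_RO_PENDING;

		/*
		 * The potentially high-latency step asked about above: once
		 * this returns, every in-flight fast path has finished and any
		 * new one is guaranteed to see RO_PENDING and back off.
		 */
		synchronize_rcu();

		for_each_possible_cpu(cpu)
			writers += *per_cpu_ptr(mnt->mnt_writers, cpu);

		if (writers > 0) {
			/* Outstanding writers; give up and allow writes again. */
			mnt->mnt_ro_state = MNT_SKETCH_WRITABLE;
			return -EBUSY;
		}
		mnt->mnt_ro_state = MNT_SKETCH_READONLY;
		return 0;
	}

Note that while RO_PENDING is set, would-be writers get a transient -EROFS
even if the remount ends up failing with -EBUSY; the real code would have to
decide whether that window is acceptable or whether writers should block or
retry instead.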