On Thu, Dec 09, 2010 at 11:24:38PM +1100, Nick Piggin wrote:
> On Thu, Dec 9, 2010 at 6:45 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Thu, Dec 09, 2010 at 05:16:44PM +1100, Nick Piggin wrote:
> >> On Thu, Dec 09, 2010 at 04:43:43PM +1100, Dave Chinner wrote:
> >> > On Mon, Nov 29, 2010 at 09:57:33PM +1100, Nick Piggin wrote:
> >> > > Hey,
> >> > >
> >> > > What was the reason behind not using my approach to use fast per-cpu
> >> > > counters for inode and dentry counters, and instead using the
> >> > > percpu_counter lib (which is not useful unless very fast approximate
> >> > > access to the global counter is required, or performance is not
> >> > > critical, which is somewhat of an oxymoron if you're using per-cpu
> >> > > counters in the first place). It is a difference between this:
> >> >
> >> > Hi Nick - sorry for being slow to answer this - I only just found
> >> > this email.
> >> >
> >> > The reason for using the generic counters is because the shrinkers
> >> > read the current value of the global counter on every call and hence
> >> > they can be read thousands of times a second. The only way to do that
> >> > efficiently is to use the approximate value the generic counters
> >> > provide.
> >>
> >> That is not what is happening, though, so I assume that no measurements
> >> were done.
> >>
> >> In fact what happens now is that *both* types of counters use the crappy
> >> percpu counter library, and the shrinkers actually do a per-cpu loop
> >> over the counters to get the sum.
> >
> > More likely that the overhead was hidden in the noise on the size of
> > machines most people test on.
>
> No. I was referring to the decision to use the heavyweight percpu_counter
> code over the superior per cpu data that I was using.

Your "superior" solution is only superior when you don't have to sum
the counters regularly.

I'll repeat what Andrew Morton said early on when your per-cpu counter
approach was first discussed:

If you think the generic percpu counters are too heavyweight, then
_fix the generic counters_ rather than hack around them. That way
everyone who uses the generic infrastructure benefits and it reduces
the desire for every subsystem to roll their own specialised percpu
counters...

> Also, the unrelated change to make nr_unused into per-cpu was not
> right, and I will revert that back to a global variable. (again, unless you
> have numbers)

What "nr_unused" variable? nr_dentrys_unused, nr_inodes_unused or
some other variable? And, apart from the overhead, why is it wrong -
does it give incorrect values?

> > It certainly wasn't measurable on my
> > 16p machine, and nobody who reviewed it at the time (several people)
> > picked it up. So thanks for reviewing it - the simple fix is below.
> >
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > david@xxxxxxxxxxxxx
> >
> > fs: Use approximate values for number of inodes and dentries
> >
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Nack. Can you please address my points and actually explain why this
> is better than my proposed approach please?

FFS. What bit of "need to sum the counters thousands of times a
second" don't you understand?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
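
For readers following the thread, the trade-off being argued (a cheap
approximate read of a global value versus an exact sum over every CPU)
can be sketched in plain userspace C. This is only an illustrative
model, not the kernel's percpu_counter code: the names approx_counter,
counter_add, counter_read_approx and counter_sum_exact, the NR_CPUS
value and the BATCH threshold are all assumptions made for the example,
and it is single-threaded with no locking.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 4   /* assumed CPU count for this model */
#define BATCH   32  /* assumed threshold at which a local delta is folded */

/* Hypothetical batched counter: one global value plus per-CPU deltas. */
struct approx_counter {
    long global;          /* cheap to read, may lag the true total */
    long local[NR_CPUS];  /* per-CPU deltas not yet folded into global */
};

/* Update from one CPU; touch the shared value only when the delta is large. */
static void counter_add(struct approx_counter *c, int cpu, long amount)
{
    long v;

    assert(cpu >= 0 && cpu < NR_CPUS);
    v = c->local[cpu] + amount;
    if (labs(v) >= BATCH) {
        c->global += v;     /* fold the accumulated delta */
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;  /* stay CPU-local */
    }
}

/* O(1): what a caller reading thousands of times a second would want. */
static long counter_read_approx(const struct approx_counter *c)
{
    return c->global;
}

/* O(NR_CPUS): exact, but walks every CPU's delta on each call. */
static long counter_sum_exact(const struct approx_counter *c)
{
    long sum = c->global;
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        sum += c->local[cpu];
    return sum;
}
```

In this model the approximate read can differ from the exact sum by at
most NR_CPUS * (BATCH - 1), which is the price paid for not looping
over every CPU on each read.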