Re: [PATCH] fs: use approximate counter values for inodes and dentries. (was Re: [patch] fs: use fast counters for vfs caches)

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 10 Dec 2010 10:30:28 +1100

On Thu, Dec 09, 2010 at 11:24:38PM +1100, Nick Piggin wrote:
> On Thu, Dec 9, 2010 at 6:45 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Thu, Dec 09, 2010 at 05:16:44PM +1100, Nick Piggin wrote:
> >> On Thu, Dec 09, 2010 at 04:43:43PM +1100, Dave Chinner wrote:
> >> > On Mon, Nov 29, 2010 at 09:57:33PM +1100, Nick Piggin wrote:
> >> > > Hey,
> >> > >
> >> > > What was the reason behind not using my approach to use fast per-cpu
> >> > > counters for inode and dentry counters, and instead using the
> >> > > percpu_counter lib (which is not useful unless very fast approximate
> >> > > access to the global counter is required, or performance is not
> >> > > critical, which is somewhat of an oxymoron if you're using per-cpu counters
> >> > > in the first place). It is a difference between this:
> >> >
> >> > Hi Nick - sorry for being slow to answer this - I only just found
> >> > this email.
> >> >
> >> > The reason for using the generic counters is because the shrinkers
> >> > read the current value of the global counter on every call and hence
> >> > they can be read thousands of times a second. The only way to do that
> >> > efficiently is to use the approximate value the generic counters
> >> > provide.
> >>
> >> That is not what is happening, though, so I assume that no measurements
> >> were done.
> >>
> >> In fact what happens now is that *both* types of counters use the crappy
> >> percpu counter library, and the shrinkers actually do a per-cpu loop
> >> over the counters to get the sum.
> >
> > More likely that the overhead was hidden in the noise on the size of
> > machines most people test on.
> 
> No. I was referring to the decision to use the heavyweight percpu_counter
> code over the superior per cpu data that I was using.

Your "superior" solution is only superior when you don't have to sum
the counters regularly.
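
To make the trade-off concrete, here is a minimal userspace sketch of a
batched counter. It is not the kernel's percpu_counter code: the names
approx_counter, counter_add, counter_read_approx and counter_sum_exact,
and the NR_CPUS and BATCH values, are hypothetical and chosen only for
illustration, and it ignores locking and preemption entirely. The
approximate read is a single load, much as percpu_counter_read() is,
while the exact sum has to walk every CPU, as percpu_counter_sum() does.

```c
#define NR_CPUS 4   /* hypothetical CPU count for this sketch */
#define BATCH   32  /* hypothetical fold threshold per CPU */

struct approx_counter {
	long global;          /* approximate total, cheap to read */
	long local[NR_CPUS];  /* per-CPU deltas not yet folded into global */
};

/* Update one CPU's delta; fold it into global once it reaches BATCH. */
static void counter_add(struct approx_counter *c, int cpu, long amount)
{
	long v = c->local[cpu] + amount;

	if (v >= BATCH || v <= -BATCH) {
		c->global += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* O(1): may lag the true value by less than NR_CPUS * BATCH. */
static long counter_read_approx(const struct approx_counter *c)
{
	return c->global;
}

/* O(NR_CPUS): exact, but touches every CPU's slot on each call. */
static long counter_sum_exact(const struct approx_counter *c)
{
	long sum = c->global;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

A caller that runs thousands of times a second, like a shrinker, would
use counter_read_approx() and accept the bounded error; only callers
that need the precise value would pay for counter_sum_exact().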

I'll repeat what Andrew Morton said early on when your per-cpu
counter approach was first discussed: If you think the generic
percpu counters are too heavyweight, then _fix the generic counters_
rather than hack around them. That way everyone who uses the generic
infrastructure benefits and it reduces the desire for every subsystem
to roll their own specialised percpu counters...

> Also, the unrelated change to make nr_unused into per-cpu was not
> right, and I will revert that back to a global variable. (again, unless you
> have numbers)

What "nr_unused" variable? nr_dentrys_unused, nr_inodes_unused or
some other variable? And, apart from the overhead, why is it wrong -
does it give incorrect values?

> > It certainly wasn't measurable on my
> > 16p machine, and nobody who reviewed it at the time (several people)
> > picked it up. So thanks for reviewing it - the simple fix is below.
> >
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > david@xxxxxxxxxxxxx
> >
> > fs: Use approximate values for number of inodes and dentries
> >
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Nack. Can you please address my points and actually explain why this
> is better than my proposed approach please?

FFS. What bit of "need to sum the counters thousands of times a
second" don't you understand?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx