On 04/09/2013 03:26 AM, Dave Chinner wrote: > On Mon, Apr 08, 2013 at 01:14:48PM +0400, Glauber Costa wrote: >> On 04/05/2013 05:15 AM, Dave Chinner wrote: >>> On Thu, Apr 04, 2013 at 06:09:31PM -0700, Greg Thelen wrote: >>>> On Fri, Mar 29 2013, Glauber Costa wrote: >>>> >>>>> From: Dave Chinner <dchinner@xxxxxxxxxx> >>>>> >>>>> Before we split up the dcache_lru_lock, the unused dentry counter >>>>> needs to be made independent of the global dcache_lru_lock. Convert >>>>> it to per-cpu counters to do this. >>>>> >>>>> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> >>>>> Reviewed-by: Christoph Hellwig <hch@xxxxxx> >>>>> --- >>>>> fs/dcache.c | 17 ++++++++++++++--- >>>>> 1 file changed, 14 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/fs/dcache.c b/fs/dcache.c >>>>> index fbfae008..f1196f2 100644 >>>>> --- a/fs/dcache.c >>>>> +++ b/fs/dcache.c >>>>> @@ -118,6 +118,7 @@ struct dentry_stat_t dentry_stat = { >>>>> }; >>>>> >>>>> static DEFINE_PER_CPU(unsigned int, nr_dentry); >>>>> +static DEFINE_PER_CPU(unsigned int, nr_dentry_unused); >>>>> >>>>> #if defined(CONFIG_SYSCTL) && defined(CONFIG_PROC_FS) >>>>> static int get_nr_dentry(void) >>>>> @@ -129,10 +130,20 @@ static int get_nr_dentry(void) >>>>> return sum < 0 ? 0 : sum; >>>>> } >>>>> >>>>> +static int get_nr_dentry_unused(void) >>>>> +{ >>>>> + int i; >>>>> + int sum = 0; >>>>> + for_each_possible_cpu(i) >>>>> + sum += per_cpu(nr_dentry_unused, i); >>>>> + return sum < 0 ? 0 : sum; >>>>> +} >>>> >>>> Just checking... If cpu x is removed, then its per cpu nr_dentry_unused >>>> count survives so we don't leak nr_dentry_unused. Right? I see code in >>>> percpu_counter_sum_positive() to explicitly handle this case and I want >>>> to make sure we don't need it here. >>> >>> DEFINE_PER_CPU() gives a variable per possible CPU, and we sum for >>> all possible CPUs. Therefore online/offline CPUs just don't matter. >>> >>> The percpu_counter code uses for_each_online_cpu(), and so it has to >>> be aware of hotplug operations so taht it doesn't leak counts. >> >> It is an unsigned quantity, however. Can't we go negative if it becomes >> unused in one cpu, but used in another? > > Sure, but it's unsigned for the purposes of summing, not for the > purposes of having pos/neg values - they are just delta counters. > > I'm just copying the code from fs/inode.c. I originally implemented > the fs/inode.c code using generic per-cpu counters, but there was > a hissy fit over "too much overhead" and so someone implemented > their own lightweight version. I've just copied the existing code to > code because I don't care to revisit this.... > Funny enough, we re implement per-cpu counters in memcg as well. This is mostly overhead/counters cache layout related. Maybe it is time for a better percpu counter ? (not that I have the time for it...) >> Ex: >> >> nr_unused/0: 0 >> nr_unused/1: 0 >> >> dentry goes to the LRU at cpu 1: >> nr_unused/0: 0 >> nr_unused/1: 1 >> >> CPU 1 goes down: >> nr_unused/0: 0 > > why? > >> dentry goes out of the LRU at cpu 0: >> nr_unused/0: 1 << 32. > > Sorry, where does that shift come from? Pulling from the LRU is just > a simple subtraction. (i.e. 0 - 1 = 0xffffffff), and so > when we sum them all up: > > nr_unused/0: 1 > nr_unused/0: -1 (0xffffffff) > > sum = 1 + 0xffffffff = 0 > >> That would easily be fixed by using a normal signed long, and is in fact >> what the percpu code does in its internal operations. > > Changing it to a long means it becomes at 64 bit value on 64 bit > machines (doubling memory usage), and now you're summing a 64 bit > values into a 32 bit integer. Something else to go wrong.... > >> Any reason not to do it? Something I am not seeing? > > It's a direct copy of the counting code in fs/inode.c. That has not > demonstrated any problems in all my monitoring for the past coupl > eof years (these are userspace visible stats) so AFAICT this code > is just fine... > Well, in this case I can revert that. My main concern is us being caught by overflows and stuff like that. But I trust the "it's working for x years" more than my eyes. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html