On Fri, Nov 4, 2022 at 4:05 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 3 Nov 2022 17:14:07 +0000 Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
>
> > ...
> >
> > Thanks for the report. It seems like there is a race between
> > for_each_online_cpu() in __percpu_counter_sum() and
> > percpu_counter_cpu_dead()/cpu-offlining. Normally this race is fine for
> > percpu_counter users, but check_mm() is not happy with this race. Can
> > you please try the following patch:
>
> percpu-counters supposedly avoid such races via the hotplug notifier.
> So can you please fully describe the race and let's see if it can be
> fixed at the percpu_counter level?
>

Yes, I am writing a more detailed commit message explaining the race and
why it is not really an issue for current users.

> >
> > From: Shakeel Butt <shakeelb@xxxxxxxxxx>
> > Date: Thu, 3 Nov 2022 06:05:13 +0000
> > Subject: [PATCH] mm: percpu_counter: use race free percpu_counter sum
> >  interface
> >
> > percpu_counter_sum can race with cpu offlining. Add a new interface
> > which does not race with it and use that for check_mm().
>
> I'll grab this version for now, as others might be seeing this issue.
>

Thanks.
> > ---
> >  include/linux/percpu_counter.h | 11 +++++++++++
> >  kernel/fork.c                  |  2 +-
> >  lib/percpu_counter.c           | 24 ++++++++++++++++++------
> >  3 files changed, 30 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
> > index bde6c4c1f405..3070c1043acf 100644
> > --- a/include/linux/percpu_counter.h
> > +++ b/include/linux/percpu_counter.h
> > @@ -45,6 +45,7 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount);
> >  void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount,
> >  			      s32 batch);
> >  s64 __percpu_counter_sum(struct percpu_counter *fbc);
> > +s64 __percpu_counter_sum_all(struct percpu_counter *fbc);
> >  int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch);
> >  void percpu_counter_sync(struct percpu_counter *fbc);
> >
> > @@ -85,6 +86,11 @@ static inline s64 percpu_counter_sum(struct percpu_counter *fbc)
> >  	return __percpu_counter_sum(fbc);
> >  }
> >
> > +static inline s64 percpu_counter_sum_all(struct percpu_counter *fbc)
> > +{
> > +	return __percpu_counter_sum_all(fbc);
> > +}
>
> We haven't been good about documenting these interfaces. Can we please
> start now? ;)
>

Yup will do.

> >
> > ...
> >
> > +
> > +/*
> > + * Add up all the per-cpu counts, return the result. This is a more accurate
> > + * but much slower version of percpu_counter_read_positive()
> > + */
> > +s64 __percpu_counter_sum(struct percpu_counter *fbc)
> > +{
> > +	return __percpu_counter_sum_mask(fbc, cpu_online_mask);
> > +}
> >  EXPORT_SYMBOL(__percpu_counter_sum);
> >
> > +s64 __percpu_counter_sum_all(struct percpu_counter *fbc)
> > +{
> > +	return __percpu_counter_sum_mask(fbc, cpu_possible_mask);
> > +}
> > +EXPORT_SYMBOL(__percpu_counter_sum_all);
>
> Probably here is a good place to document it.
>
> Is there any point in having the
> percpu_counter_sum_all()->__percpu_counter_sum_all() inlined wrapper?
> Why not name this percpu_counter_sum_all() directly?
>

Ack.
thanks,
Shakeel