Re: [PATCH 1/2] percpu_ref: add percpu_ref_atomic_count()

On Fri, Apr 16, 2021 at 02:37:03PM +0000, Dennis Zhou wrote:
> On Fri, Apr 16, 2021 at 10:10:07PM +0800, Ming Lei wrote:
> > On Fri, Apr 16, 2021 at 02:16:41PM +0100, Pavel Begunkov wrote:
> > > On 16/04/2021 05:45, Dennis Zhou wrote:
> > > > Hello,
> > > > 
> > > > On Fri, Apr 16, 2021 at 01:22:51AM +0100, Pavel Begunkov wrote:
> > > >> Add percpu_ref_atomic_count(), which returns number of references of a
> > > >> percpu_ref switched prior into atomic mode, so the caller is responsible
> > > >> to make sure it's in the right mode.
> > > >>
> > > >> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
> > > >> ---
> > > >>  include/linux/percpu-refcount.h |  1 +
> > > >>  lib/percpu-refcount.c           | 26 ++++++++++++++++++++++++++
> > > >>  2 files changed, 27 insertions(+)
> > > >>
> > > >> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> > > >> index 16c35a728b4c..0ff40e79efa2 100644
> > > >> --- a/include/linux/percpu-refcount.h
> > > >> +++ b/include/linux/percpu-refcount.h
> > > >> @@ -131,6 +131,7 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
> > > >>  void percpu_ref_resurrect(struct percpu_ref *ref);
> > > >>  void percpu_ref_reinit(struct percpu_ref *ref);
> > > >>  bool percpu_ref_is_zero(struct percpu_ref *ref);
> > > >> +unsigned long percpu_ref_atomic_count(struct percpu_ref *ref);
> > > >>  
> > > >>  /**
> > > >>   * percpu_ref_kill - drop the initial ref
> > > >> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> > > >> index a1071cdefb5a..56286995e2b8 100644
> > > >> --- a/lib/percpu-refcount.c
> > > >> +++ b/lib/percpu-refcount.c
> > > >> @@ -425,6 +425,32 @@ bool percpu_ref_is_zero(struct percpu_ref *ref)
> > > >>  }
> > > >>  EXPORT_SYMBOL_GPL(percpu_ref_is_zero);
> > > >>  
> > > >> +/**
> > > >> + * percpu_ref_atomic_count - returns number of left references
> > > >> + * @ref: percpu_ref to test
> > > >> + *
> > > >> + * This function is safe to call as long as @ref is switch into atomic mode,
> > > >> + * and is between init and exit.
> > > >> + */
> > > >> +unsigned long percpu_ref_atomic_count(struct percpu_ref *ref)
> > > >> +{
> > > >> +	unsigned long __percpu *percpu_count;
> > > >> +	unsigned long count, flags;
> > > >> +
> > > >> +	if (WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count)))
> > > >> +		return -1UL;
> > > >> +
> > > >> +	/* protect us from being destroyed */
> > > >> +	spin_lock_irqsave(&percpu_ref_switch_lock, flags);
> > > >> +	if (ref->data)
> > > >> +		count = atomic_long_read(&ref->data->count);
> > > >> +	else
> > > >> +		count = ref->percpu_count_ptr >> __PERCPU_REF_FLAG_BITS;
> > > > 
> > > > Sorry I missed Jens' patch before and also the update to percpu_ref.
> > > > However, I feel like I'm missing something. This isn't entirely related
> > > > to your patch, but I'm not following why percpu_count_ptr stores the
> > > > excess count of an exited percpu_ref and doesn't warn when it's not
> > > > zero. It seems like this should be an error if it's not 0?
> > > > 
> > > > Granted we have made some contract with the user to do the right thing,
> > > > but say someone does mess up, we don't indicate to them hey this ref is
> > > > actually dead and if they're waiting for it to go to 0, it never will.
> > > 
> > > fwiw, I copied is_zero, but skimming through the code don't immediately
> > > see myself why it is so...
> > > 
> > > Cc Ming, he split out some parts of it to dynamic allocation not too
> > > long ago, maybe he knows the trick.
> > 
> > I remembered that percpu_ref_is_zero() can be called even after percpu_ref_exit()
> > returns, and looks percpu_ref_is_zero() isn't classified into 'active use'.
> > 
> 
> Looking at the commit prior, it seems like percpu_ref_is_zero() was
> subject to the usual init and exit lifetime. I guess I'm just not
> convinced it should ever be > 0. I'll think about it a little longer and
> might fix it.

The count may not be > 0 at that time, but percpu_ref_is_zero() was
allowed to read an uninitialized refcount, and such a kernel oops was
reported:

https://lore.kernel.org/lkml/165db20c-bfc5-fca8-1ecf-45d85ea5d9e2@xxxxxxxxx/#r
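
For reference, here is a rough user-space sketch of the behaviour discussed
in this thread: exit stashes the last count in the upper bits of
percpu_count_ptr, and the is_zero check falls back to that stashed value once
ref->data is gone. It is only a model under stated assumptions, not the kernel
code: the model_* names and MODEL_* constants are hypothetical, there is no
locking and no per-CPU counting, and a plain unsigned long stands in for the
atomic count.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * User-space model only: the model_* names and MODEL_* constants are
 * hypothetical and merely mirror the fields quoted from
 * lib/percpu-refcount.c above. No locking, no per-CPU counters.
 */
#define MODEL_FLAG_BITS   2
#define MODEL_FLAG_ATOMIC 1UL

struct model_ref_data {
	unsigned long count;		/* stands in for the atomic count */
};

struct model_ref {
	unsigned long percpu_count_ptr;	/* low bits: flags; after exit, upper bits: saved count */
	struct model_ref_data *data;	/* set to NULL by model_ref_exit() */
};

/* Start in "atomic mode" with the initial reference held. */
static int model_ref_init(struct model_ref *ref)
{
	ref->data = malloc(sizeof(*ref->data));
	if (!ref->data)
		return -1;
	ref->data->count = 1;
	ref->percpu_count_ptr = MODEL_FLAG_ATOMIC;
	return 0;
}

static void model_ref_get(struct model_ref *ref) { ref->data->count++; }
static void model_ref_put(struct model_ref *ref) { ref->data->count--; }

/* Free the data, but stash the last count so later reads stay valid. */
static void model_ref_exit(struct model_ref *ref)
{
	if (!ref->data)
		return;
	ref->percpu_count_ptr |= ref->data->count << MODEL_FLAG_BITS;
	free(ref->data);
	ref->data = NULL;
}

/* Same branch as in the quoted hunk: live data first, stashed count otherwise. */
static unsigned long model_ref_count(const struct model_ref *ref)
{
	if (ref->data)
		return ref->data->count;
	return ref->percpu_count_ptr >> MODEL_FLAG_BITS;
}

static bool model_ref_is_zero(const struct model_ref *ref)
{
	return model_ref_count(ref) == 0;
}
```

In this model, if the initial reference is dropped before model_ref_exit(),
model_ref_is_zero() returns true both before and after exit. If one reference
is still held at exit, the stashed count is 1 and model_ref_is_zero() keeps
returning false, which is the "it never will go to 0" case raised above.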

Thanks, 
Ming



