> -----Original Message-----
> From: kernelnewbies-bounces@xxxxxxxxxxxxxxxxx [mailto:kernelnewbies-bounces@xxxxxxxxxxxxxxxxx] On Behalf Of Jeff Haran
> Sent: Wednesday, July 23, 2014 10:33 AM
> To: 'Greg KH'
> Cc: kernelnewbies@xxxxxxxxxxxxxxxxx
> Subject: RE: question about kref API
>
> > -----Original Message-----
> > From: Greg KH [mailto:greg@xxxxxxxxx]
> > Sent: Tuesday, July 22, 2014 4:47 PM
> > To: Jeff Haran
> > Cc: kernelnewbies@xxxxxxxxxxxxxxxxx
> > Subject: Re: question about kref API
> >
> > On Tue, Jul 22, 2014 at 05:25:03PM +0000, Jeff Haran wrote:
> > > > -----Original Message-----
> > > > From: Greg KH [mailto:greg@xxxxxxxxx]
> > > > Sent: Monday, July 21, 2014 7:18 PM
> > > > To: Jeff Haran
> > > > Cc: kernelnewbies@xxxxxxxxxxxxxxxxx
> > > > Subject: Re: question about kref API
> > > >
> > > > On Tue, Jul 22, 2014 at 12:27:20AM +0000, Jeff Haran wrote:
> > > > > Hi,
> > > > >
> > > > > I've been reading Documentation/kref.txt in order to understand
> > > > > the usage of this API. There is part of this documentation that
> > > > > I am having difficulty understanding and was hoping somebody on
> > > > > this list could clarify this. One of my assumptions in reading
> > > > > this is that the expected usage of this API is that for a given
> > > > > object embedding a kref, once the object has been initialized
> > > > > the number of calls to "put" a given instance of an object
> > > > > should never exceed the number of calls to "get" that same instance.
> > > >
> > > > If it does, the object will be cleaned up and deleted from the
> > > > system, so you no longer have a valid pointer.
> > > >
> > > > > Maybe that's the root of my misunderstanding, but my assumption
> > > > > is that calls to kref_get() and kref_put() are expected to be
> > > > > made in pairs for a given instance of struct kref. If this is
> > > > > wrong, please let me know.
> > > >
> > > > "pairs" in that the same number must be called in order for things
> > > > to work properly.
> > > >
> > > > > Kref.txt includes some sample code that discusses using a mutex
> > > > > to serialize the execution of its get_entry() and put_entry() functions:
> > > > >
> > > > > 146 static DEFINE_MUTEX(mutex);
> > > > > 147 static LIST_HEAD(q);
> > > > > 148 struct my_data
> > > > > 149 {
> > > > > 150 	struct kref refcount;
> > > > > 151 	struct list_head link;
> > > > > 152 };
> > > > > 153
> > > > > 154 static struct my_data *get_entry()
> > > > > 155 {
> > > > > 156 	struct my_data *entry = NULL;
> > > > > 157 	mutex_lock(&mutex);
> > > > > 158 	if (!list_empty(&q)) {
> > > > > 159 		entry = container_of(q.next, struct my_data, link);
> > > > > 160 		kref_get(&entry->refcount);
> > > > > 161 	}
> > > > > 162 	mutex_unlock(&mutex);
> > > > > 163 	return entry;
> > > > > 164 }
> > > > > 165
> > > > > 166 static void release_entry(struct kref *ref)
> > > > > 167 {
> > > > > 168 	struct my_data *entry = container_of(ref, struct my_data, refcount);
> > > > > 169
> > > > > 170 	list_del(&entry->link);
> > > > > 171 	kfree(entry);
> > > > > 172 }
> > > > > 173
> > > > > 174 static void put_entry(struct my_data *entry)
> > > > > 175 {
> > > > > 176 	mutex_lock(&mutex);
> > > > > 177 	kref_put(&entry->refcount, release_entry);
> > > > > 178 	mutex_unlock(&mutex);
> > > > > 179 }
> > > > >
> > > > > The sample code does not show the creation of the linked list
> > > > > headed by q,
> > > >
> > > > That is done there in the static initializer.
> >
> > [meta comment, please properly wrap your lines...]

Apologies for the long lines in my most recent post. While composing it, I
terminated each line with a return at short line lengths, but Outlook then
crams them all back together again when I hit send. I'll do better next time.

> >
> > > You are referring to line 147 here, right? That creates an empty
> > > list if I am following the code correctly.
> >
> > Yes.
> >
> > > What I meant was that the sample code doesn't show how instances of
> > > struct my_data get initialized and inserted into the list at q.
> > > Makes sense to leave that out for brevity, I suppose, and you've made
> > > it clear below that for this to work right a call to kref_init()
> > > must have been made on the refcount fields of any struct my_data
> > > instances that got put into the list headed by q. Thanks.
> >
> > Yes, that code is left as an exercise for the reader. Or you can just
> > look at the kernel where it is used in many places directly :)
> >
> > > At this point it's rule (3) that I am still struggling with a bit:
> > >
> > > 50 3) If the code attempts to gain a reference to a kref-ed structure
> > > 51    without already holding a valid pointer, it must serialize access
> > > 52    where a kref_put() cannot occur during the kref_get(), and the
> > > 53    structure must remain valid during the kref_get().
> > >
> > > In this example, every call to put_entry() results in locking and
> > > unlocking the mutex. But if I am following this right, that is only
> > > because the entry at the head of the list is removed from the list
> > > when and only when the last reference to it is released.
> >
> > It has nothing to do with a list; don't get hung up on a list with a
> > kref. It's not needed, just used as an example here.
> >
> > > If the list_del() happened for some other cause (say a timer expired
> > > or user space sent a message to delete the entry), then the taking
> > > of the mutex in put_entry() wouldn't be necessary, right?
> >
> > No, it still would be.
> >
> > > For instance, this would be valid, wouldn't it?
> > >
> > > static void release_entry(struct kref *ref)
> > > {
> > > 	struct my_data *entry = container_of(ref, struct my_data, refcount);
> > >
> > > 	kfree(entry);
> > > }
> > >
> > > static void put_entry(struct my_data *entry)
> > > {
> > > 	kref_put(&entry->refcount, release_entry);
> > > }
> > >
> > > static void del_entry(struct my_data *entry)
> > > {
> > > 	mutex_lock(&mutex);
> > > 	list_del(&entry->link);
> > > 	mutex_unlock(&mutex);
> > > 	put_entry(entry);
> > > }
> >
> > Nope. Don't get list entries confused with an object that a kref
> > happens to maintain. You still need to lock the kref itself; two
> > different people could be calling put_entry() at the same time, right?
>
> Ok, so that's an issue I don't understand. Looking at what kref_put()
> does, I don't see why any external locking would be required to serialize
> concurrent callers to put_entry(). As I am sure you are aware, kref_put()
> ends up being a call to this, where the count passed in is 1:
>
> 68 static inline int kref_sub(struct kref *kref, unsigned int count,
> 69 	     void (*release)(struct kref *kref))
> 70 {
> 71 	WARN_ON(release == NULL);
> 72
> 73 	if (atomic_sub_and_test((int) count, &kref->refcount)) {
> 74 		release(kref);
> 75 		return 1;
> 76 	}
> 77 	return 0;
> 78 }
>
> atomic_sub_and_test() is in and of itself thread safe, at least that's the
> way I read its header comment block. In this case the refcount field is
> atomically decremented by 1, and if the result of that decrement is 0 it
> returns TRUE, otherwise FALSE. That means that if there are N concurrent
> callers to kref_put() when refcount initially contains N,
> atomic_sub_and_test() will return TRUE for 1 and only 1 of those callers.
> All of the other callers will see atomic_sub_and_test() return FALSE and
> simply return 0. That means the release function can get called only once.
> And since that will only happen when refcount has gotten down to 0, there
> will be no other existing references to the kref, as all the other callers
> must already have proceeded far enough that their calls to
> atomic_sub_and_test() have completed. They have nothing left to do but
> return 0 to their callers, and if those callers are following the rules
> they will make no further references to the entry. So I don't see how any
> sort of race condition is possible among concurrent callers to put_entry()
> here.
>
> There also doesn't seem to be any sort of race possible between concurrent
> callers to put_entry() and get_entry(). If put_entry() finds that the
> reference count has gotten down to zero as part of its call to
> atomic_sub_and_test(), then the entry can't be in the list, so a concurrent
> caller to get_entry() will either see that the list is empty and never take
> a new reference, or it will take a reference to the next entry in the list.
>
> My apologies for seeming to rat hole on this, but I must be missing
> something fundamental here, since I really don't see any race conditions
> for your proposed put side locking to prevent. The usage of the atomic
> refcount seems to be sufficient in this case. That's the beauty of the API,
> or at least that's what I thought was the beauty of it. :)
>
> Clearly there are potential performance benefits in not needing to take
> locks or mutexes when they are not necessary. If you could elaborate on
> where the race condition is here, I think you'd be doing both me and the
> community a great service.
>
> Thanks again,
>
> Jeff Haran
>
> > > In this example, threads that want access to an entry do need to
> > > take the mutex in order to get it from the list in get_entry(), but
> > > when they are done with it but don't want it deleted from the list
> > > they can call put_entry() and no mutex need be taken.
> > > The only time the mutex need be taken is when the caller wants to also
> > > delete the entry from the list, which is what del_entry() is for.
> > >
> > > Put another way, the mutex is really there to serialize access to
> > > the list, right?
> >
> > Again, don't focus on the list, and you should be fine.
> >
> > Do you have a real-world use of kref that you are curious about? That
> > might help out more here.
> >
> > thanks,
> >
> > greg k-h
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies@xxxxxxxxxxxxxxxxx
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies