> Otherwise we would have to add synchronize_rcu(); after every single
> kmem_cache allocation which might be using RCU, and that would be
> terrible, no?

... if ext4 is not freeing memory blocks that might still be referenced
by RCU readers, then the SLAB_TYPESAFE_BY_RCU should be removed.
This "might still be referenced" is from the viewpoint of the code using
the allocator, not from that of the allocator itself.

So the typical RCU approach (not involving SLAB_TYPESAFE_BY_RCU)
is to take the grace period at the time of the free.  This can be
done synchronously using synchronize_rcu(), but is often instead done
asynchronously using call_rcu() or kfree_rcu().  So in this case,
you don't need synchronize_rcu() on allocation because the required
grace period already happened at *free() time.
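
As a rough illustration of taking the grace period at free time, here is
a toy, single-threaded userspace model.  It is not the kernel API: every
toy_* name is hypothetical, and the "grace period" is simplified to "no
reader is currently inside a read-side section", which is stricter than
real RCU (real RCU waits only for readers that already existed when the
free was requested).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for RCU; all names here are hypothetical. */
struct toy_deferred {
	struct toy_deferred *next;
	void *ptr;
};

static int toy_readers;                  /* readers inside a read-side section */
static struct toy_deferred *toy_pending; /* blocks waiting for readers to leave */
static int toy_freed;                    /* number of blocks actually released */

static void toy_read_lock(void)
{
	toy_readers++;
}

/* Release queued blocks, but only once no reader can still hold a reference. */
static void toy_reclaim(void)
{
	if (toy_readers != 0)
		return;
	while (toy_pending) {
		struct toy_deferred *d = toy_pending;

		toy_pending = d->next;
		free(d->ptr);
		free(d);
		toy_freed++;
	}
}

static void toy_read_unlock(void)
{
	toy_readers--;
	toy_reclaim();
}

/* Loose analogue of call_rcu()/kfree_rcu(): queue the free instead of doing it now. */
static int toy_free_deferred(void *ptr)
{
	struct toy_deferred *d = malloc(sizeof(*d));

	if (!d)
		return -1;
	d->ptr = ptr;
	d->next = toy_pending;
	toy_pending = d;
	toy_reclaim();
	return 0;
}
```

Because the wait is paid on the free side, the allocation path in this
model needs no synchronization at all, which is the point made above.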

But there are a few situations where it makes sense to free blocks that
readers might still be referencing.  Readers must then add validity
checks to detect this case, and must also prevent freeing while they
are using the block, for example, by means of a per-block spinlock
for synchronization.  For example, a reader
might acquire a spinlock in the block to prevent changes, recheck the
lookup key, and if the key does not match, release the lock and pretend
not to have found the block.  If the key does match, anything attempting
to delete and free the block will be spinning on that same spinlock.
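
The lock-then-recheck pattern just described might look roughly like the
following userspace sketch.  It is only a model under stated assumptions:
struct toy_block and the toy_* helpers are hypothetical names, a C11
atomic_flag stands in for the kernel spinlock, and "free plus reuse" is
simulated by rewriting the key of the same block.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical block type; the lock must survive reuse of the block. */
struct toy_block {
	atomic_flag lock;	/* per-block spinlock, never re-zeroed on reuse */
	int key;		/* lookup key, rechecked under the lock */
	int value;
};

static void toy_lock(struct toy_block *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;	/* spin until the current holder releases the lock */
}

static void toy_unlock(struct toy_block *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/*
 * Reader side: the pointer may refer to a block that has since been
 * recycled for another key.  Returns the block, still locked, if the
 * key matches; otherwise drops the lock and reports "not found".
 */
static struct toy_block *toy_validate(struct toy_block *b, int key)
{
	toy_lock(b);
	if (b->key != key) {
		toy_unlock(b);
		return NULL;
	}
	return b;
}

/*
 * Updater side: models freeing the block and reusing it for a new key.
 * It takes the same lock, so it waits for any reader that has validated
 * the block, and it does not reinitialize the lock itself.
 */
static void toy_recycle(struct toy_block *b, int new_key, int new_value)
{
	toy_lock(b);
	b->key = new_key;
	b->value = new_value;
	toy_unlock(b);
}
```

Note that the reuse path in this sketch changes only the key and value.
Zeroing the whole block on reuse would also wipe a lock that a reader
might be holding or spinning on.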

And so if you specify SLAB_TYPESAFE_BY_RCU, the slab allocator is
guaranteeing type safety to RCU readers instead of the usual existence
guarantee.  A memory block might be freed and reused out from under an
RCU reader, but its type will remain the same.  This means that the
grace period happens inside the slab allocator, when a slab's memory is
returned to the system, rather than when an individual block is freed.

So either the validation checks are quite novel, the kmem_cache_zalloc()
calls should be replaced by kmem_cache_alloc() plus validation checks,
or the SLAB_TYPESAFE_BY_RCU should be removed.

Just out of curiosity, what is your mental model of SLAB_TYPESAFE_BY_RCU?

And yes, I did just up the visibility of this topic in my upcoming
presentation...

							Thanx, Paul

> > > > ---
> > > >  fs/jbd2/journal.c | 9 ++++++---
> > > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> > > > index c2cf74b01ddb..323112de5921 100644
> > > > --- a/fs/jbd2/journal.c
> > > > +++ b/fs/jbd2/journal.c
> > > > @@ -2861,15 +2861,18 @@ static struct journal_head *journal_alloc_journal_head(void)
> > > >  #ifdef CONFIG_JBD2_DEBUG
> > > >  	atomic_inc(&nr_journal_heads);
> > > >  #endif
> > > > -	ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_NOFS);
> > > > +	ret = kmem_cache_alloc(jbd2_journal_head_cache, GFP_NOFS);
> > > >  	if (!ret) {
> > > >  		jbd_debug(1, "out of memory for journal_head\n");
> > > >  		pr_notice_ratelimited("ENOMEM in %s, retrying.\n", __func__);
> > > > -		ret = kmem_cache_zalloc(jbd2_journal_head_cache,
> > > > +		ret = kmem_cache_alloc(jbd2_journal_head_cache,
> > > >  				GFP_NOFS | __GFP_NOFAIL);
> > > >  	}
> > > > -	if (ret)
> > > > +	if (ret) {
> > > > +		synchronize_rcu();
> > > > +		memset(ret, 0, sizeof(*ret));
> > > >  		spin_lock_init(&ret->b_state_lock);
> > > > +	}
> > > >  	return ret;
> > > >  }
> > > >  
> > > > -- 
> > > > 2.30.2
> > > > 
> > > -- 
> > > Jan Kara <jack@xxxxxxxx>
> > > SUSE Labs, CR