Re: [PATCH 09/24] rcu/tree: cache specified number of objects

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Mon, 4 May 2020 11:07:37 -0700

On Mon, May 04, 2020 at 07:48:22PM +0200, Uladzislau Rezki wrote:
> On Mon, May 04, 2020 at 08:24:37AM -0700, Paul E. McKenney wrote:
> > On Mon, May 04, 2020 at 02:43:23PM +0200, Uladzislau Rezki wrote:
> > > On Fri, May 01, 2020 at 02:27:49PM -0700, Paul E. McKenney wrote:
> > > > On Tue, Apr 28, 2020 at 10:58:48PM +0200, Uladzislau Rezki (Sony) wrote:
> > > > > Cache some extra objects per-CPU. During reclaim process
> > > > > some pages are cached instead of releasing by linking them
> > > > > into the list. Such approach provides O(1) access time to
> > > > > the cache.
> > > > > 
> > > > > That reduces number of requests to the page allocator, also
> > > > > that makes it more helpful if a low memory condition occurs.
> > > > > 
> > > > > A parameter reflecting the minimum allowed pages to be
> > > > > cached per one CPU is propagated via sysfs, it is read
> > > > > only, the name is "rcu_min_cached_objs".
> > > > > 
> > > > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
> > > > > ---
> > > > >  kernel/rcu/tree.c | 64 ++++++++++++++++++++++++++++++++++++++++++++---
> > > > >  1 file changed, 60 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > index 89e9ca3f4e3e..d8975819b1c9 100644
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -178,6 +178,14 @@ module_param(gp_init_delay, int, 0444);
> > > > >  static int gp_cleanup_delay;
> > > > >  module_param(gp_cleanup_delay, int, 0444);
> > > > >  
> > > > > +/*
> > > > > + * This rcu parameter is read-only, but can be write also.
> > > > 
> > > > You mean that although the parameter is read-only, you see no reason
> > > > why it could not be converted to writeable?
> > > > 
> > > I added just a note. If it is writable, then we can change the size of the
> > > per-CPU cache dynamically, i.e. "echo 5 > /sys/.../rcu_min_cached_objs"
> > > would cache 5 pages. But i do not have a strong opinion if it should be
> > > writable.
> > > 
> > > > If it was writeable, and a given CPU had the maximum number of cached
> > > > objects, the rcu_min_cached_objs value was decreased, but that CPU never
> > > > saw another kfree_rcu(), would the number of cached objects change?
> > > > 
> > > No. It works this way: kfree_rcu() dequeues a page from the cache,
> > > whereas the "rcu work" puts it back if the number of objects < rcu_min_cached_objs,
> > > and frees the page if it is >=.
> > 
> > Just to make sure I understand...  If someone writes a smaller number to
> > the sysfs variable, the per-CPU caches will be decreased at that point,
> > immediately during that sysfs write?  Or are you saying something else?
> > 
> This patch defines it as read-only. It defines the minimum threshold that
> controls the number of elements in the per-CPU cache. If we decide to make it
> writable as well, then we will have full freedom in how to define its behavior,
> i.e. it is not defined because it is read-only.

And runtime-read-only sounds like an excellent state for it.
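
For readers following along, the put-back-or-free behavior described above
can be sketched in user-space C. This is only an illustration under stated
assumptions, not the code from the patch: struct page_cache, struct
cached_page, cache_get(), cache_put() and min_cached_objs are hypothetical
names, malloc()/free() stand in for the page allocator, and the per-CPU
lock that would guard the cache in the kernel is only mentioned in comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the read-only rcu_min_cached_objs parameter. */
static const int min_cached_objs = 2;

/* One cached "page": a plain singly linked node, no atomics involved. */
struct cached_page {
	struct cached_page *next;
};

/*
 * Hypothetical per-CPU cache. In the kernel, a lock would protect both
 * fields; that locking is omitted from this single-threaded sketch.
 */
struct page_cache {
	struct cached_page *head;
	int nr_cached;
};

/* kfree_rcu() side: pop a page from the cache, or return NULL if empty. */
static struct cached_page *cache_get(struct page_cache *c)
{
	struct cached_page *p = c->head;

	if (p) {
		c->head = p->next;
		c->nr_cached--;
	}
	return p;
}

/*
 * "rcu work" side: keep the page while the cache is below the threshold,
 * otherwise release it. Returns true if the page was cached.
 */
static bool cache_put(struct page_cache *c, struct cached_page *p)
{
	assert(p != NULL);

	if (c->nr_cached < min_cached_objs) {
		p->next = c->head;
		c->head = p;
		c->nr_cached++;
		return true;
	}

	free(p);
	return false;
}
```

Because the threshold is only consulted in cache_put(), a CPU that never
runs the reclaim path again would keep whatever it already has cached,
which matches the "No" answer quoted above.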

> > > > Presumably the list can also be accessed without holding this lock,
> > > > because otherwise we shouldn't need llist...
> > > > 
> > > Hm... We increase the number of elements in cache, therefore it is not
> > > lockless. On the other hand, i used llist_head to maintain the cache
> > > because it is a singly linked list, we do not need the "*prev" link. Also
> > > we do not need to init the list.
> > > 
> > > But i can change it to list_head. Please let me know if i need :)
> > 
> > Hmmm...  Maybe it is time for a non-atomic singly linked list?  In the RCU
> > callback processing, the operations were open-coded, but they have been
> > pushed into include/linux/rcu_segcblist.h and kernel/rcu/rcu_segcblist.*.
> > 
> > Maybe some non-atomic/protected/whatever macros in the llist.h file?
> > Or maybe just open-code the singly linked list?  (Probably not the
> > best choice, though.)  Add comments stating that the atomic properties
> > of the llist functions aren't needed?  Something else?
> >
> In order to keep it simple i can replace llist_head by the list_head?

Fine by me!

> > The comments would be a good start.  Just to take pity on people seeing
> > the potential for concurrency and wondering how the concurrent accesses
> > actually happen.  ;-)
> > 
> Sounds like you are kidding me :) 

"Only those who have gone too far can possibly tell you how far you
can go!"  ;-)

							Thanx, Paul