On Sun, Aug 11, 2019 at 04:49:39PM -0700, Paul E. McKenney wrote: > Maybe. Note well that I said "potential issue". When I checked a few > years ago, none of the uses of rcu_barrier() cared about kfree_rcu(). > They cared instead about call_rcu() callbacks that accessed code or data > that was going to disappear soon, for example, due to module unload or > filesystem unmount. > > So it -might- be that rcu_barrier() can stay as it is, but with changes > as needed to documentation. > > It also -might- be, maybe now or maybe some time in the future, that > there will need to be a kfree_rcu_barrier() or some such. But if so, > let's not create it until it is needed. For one thing, it is reasonably > likely that something other than a kfree_rcu_barrier() would really > be what was needed. After all, the main point would be to make sure > that the old memory really was freed before allocating new memory. Now I fully understand what you meant thanks to you. Thank you for explaining it in detail. > But if the system had ample memory, why wait? In that case you don't > really need to wait for all the old memory to be freed, but rather for > sufficient memory to be available for allocation. Agree. Totally make sense. Thanks, Byungchul > > Thanx, Paul > > > Thanks, > > Byungchul > > > > > But now that we can see letting the list just grow works well, we don't > > > have to consider this one at the moment. Let's consider this method > > > again once we face the problem in the future by any chance. > > > > > > > We should therefore just let the second list grow. If experience shows > > > > a need for callbacks to be sent up more quickly, it should be possible > > > > to provide an additional list, so that two lists on a given CPU can both > > > > be waiting for a grace period at the same time. > > > > > > Or the third and fourth list might be needed in some system. But let's > > > talk about it later too. > > > > > > > > > I also agree. But this _FORCE thing will still not solve the issue Paul is > > > > > > raising which is doing this loop possibly in irq disabled / hardirq context. > > > > > > > > > > I added more explanation above. What I suggested is a way to avoid not > > > > > only heavy > > > > > work within the irq-disabled region of a single kfree_rcu() but also > > > > > too many requests > > > > > to be queued into ->head. > > > > > > > > But let's start simple, please! > > > > > > Yes. The simpler, the better. > > > > > > > > > We can't even cond_resched() here. In fact since _FORCE is larger, it will be > > > > > > even worse. Consider a real-time system with a lot of memory, in this case > > > > > > letting ->head grow large is Ok, but looping for long time in IRQ disabled > > > > > > would not be Ok. > > > > > > > > > > Please check the explanation above. > > > > > > > > > > > But I could make it something like: > > > > > > 1. Letting ->head grow if ->head_free busy > > > > > > 2. If head_free is busy, then just queue/requeue the monitor to try again. > > > > > > > > > > This is exactly what Paul said. The problem with this is ->head can grow too > > > > > much. That's why I suggested the above one. > > > > > > > > It can grow quite large, but how do you know that limiting its size will > > > > really help? Sure, you have limited the size, but does that really do > > > > > > To decide the size, we might have to refer to how much pressure on > > > memory and RCU there are at that moment and adjust it on runtime. > > > > > > > anything for the larger problem of extreme kfree_rcu() rates on the one > > > > hand and a desire for more efficient handling of kfree_rcu() on the other? > > > > > > Assuming current RCU logic handles extremly high rate well which is > > > anyway true, my answer is *yes*, because batching anyway has pros and > > > cons. One of major cons is there must be inevitable kfree_rcu() requests > > > that not even request to RCU. By allowing only the size of batching, the > > > situation can be mitigated. > > > > > > I just answered to you. But again, let's talk about it later once we > > > face the problem as you said. > > > > > > Thanks, > > > Byungchul > > > > > > > Thanx, Paul > > > > > > > > > > This would even improve performance, but will still risk going out of memory. > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > thanks, > > > > > > > > > > > > - Joel > > > > > > > > > > > > > > > > > > > > This way, we can avoid both: > > > > > > > > > > > > > > 1. too many requests being queued and > > > > > > > 2. __call_rcu() bunch of requests within a single kfree_rcu(). > > > > > > > > > > > > > > Thanks, > > > > > > > Byungchul > > > > > > > > > > > > > > > > > > > > > > > But please feel free to come up with a better solution! > > > > > > > > > > > > > > > > [ . . . ] > > > > > > > > > > > > > > > > > > > > -- > > > > > Thanks, > > > > > Byungchul > > > > >