On Tue, Mar 31, 2020 at 06:01:19PM +0200, Uladzislau Rezki wrote:
> >
> > Yes, I mean __GFP_MEMALLOC. Sorry, the patch was just to show the idea and
> > marked as RFC.
> >
> > Good point on the atomic aspect of this path, you are right we cannot sleep.
> > I believe the GFP_NOWAIT I mentioned in my last reply will take care of that?
> >
> I think there should be GFP_ATOMIC used, because it has more chance to
> return memory then GFP_NOWAIT. I see that Michal has same view on it.

I don't think so, because GFP_ATOMIC implies GFP_NOWAIT. I am OK with
keeping GFP_ATOMIC as it is, btw. Paul mentioned he prefers this. I agree
with that as well.

> > > As for removing __GFP_NOWARN. Actually it is expectable that an
> > > allocation can fail, if so we follow last emergency case. You
> > > can see the trace but what would you do with that information?
> >
> > Yes, the benefit of the trace/warning is that the user can switch to a
> > non-headless API and avoid the synchronize_rcu(), that would help them get
> > faster kfree_rcu() performance instead of having silent slowdowns.
> >
> Agree. What about just adding WARN_ON_ONCE()? I am just thinking if it
> could be harmful or not.

You mean WARN_ON_ONCE() before the synchronize_rcu(), right? We could do
that. Paul mentioned to me that he prefers this new warning to be something
that can be turned off with a boot parameter, since some future user may
prefer no warning. I also agree. If we add this, then we can keep your
__GFP_NOWARN flag with no additional GFP flag changes.

> > It also tells us whether the headless API is worth it in the long run, I
> > think it is worth it because we will likely never hit the synchronize_rcu()
> > failsafe. But if we hit it a lot, at least it wont happen silently.
> >
> Agree.
>
> > Paul was concerned about following scenario with hitting synchronize_rcu():
> > 1. Consider a system under memory pressure.
> > 2. Consider some other subsystem X depending on another system Y which uses
> >    kfree_rcu(). If Y doesn't complete the operation in time, X accumulates
> >    more memory.
> > 3. Since kfree_rcu() on Y hits synchronize_rcu() a lot, it slows it down.
> >    This causes X to further allocate memory, further causing a chain
> >    reaction.
> > Paul, please correct me if I'm wrong.
> >
> I see your point and agree that in theory it can happen. So, we should
> make it more tight when it comes to rcu_head attachment logic.

Right. Per discussion with Paul, it is better if we pre-allocate N array
blocks per CPU and use them for the cache, with N defaulting to 1 and
tunable with a boot parameter. I agree with this. In the current code, we
have 1 cache page per CPU, but it is allocated only on the first
kvfree_rcu() request. So we could change this behavior as well, to make it
pre-allocated.

Does this all sound good to you?

thanks,

 - Joel
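
P.S. To make the pre-allocation and warn-once ideas above a bit more
concrete, here is a rough userspace sketch. It is not kernel code and not
the actual patch: every name in it (struct cpu_cache, cache_prefill(),
defer_free(), cache_drain(), warn_on_fallback, BLOCK_PTRS) is made up for
illustration, malloc()/free() stand in for the page allocator and kvfree(),
and the place where the kernel would call synchronize_rcu() is only marked
with a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCK_PTRS 4 /* hypothetical: pointers held by one array block */

struct ptr_block {
	struct ptr_block *next;
	size_t nr;
	void *ptrs[BLOCK_PTRS];
};

/* One of these per CPU in the real thing; a single instance here. */
struct cpu_cache {
	struct ptr_block *free_blocks; /* pre-allocated, currently unused */
	struct ptr_block *cur;         /* block being filled (heads a chain) */
	size_t nr_free;
	unsigned long slow_hits;       /* times the fallback was taken */
	bool warned;
};

/* Stand-in for a boot parameter that silences the one-time warning. */
static bool warn_on_fallback = true;

/* Pre-allocate up to n blocks; returns how many were actually added. */
static size_t cache_prefill(struct cpu_cache *c, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct ptr_block *b = calloc(1, sizeof(*b));

		if (!b)
			break;
		b->next = c->free_blocks;
		c->free_blocks = b;
		c->nr_free++;
	}
	return i;
}

/*
 * Queue p for deferred freeing without allocating anything.
 * Returns true on the fast path, false if the fallback ran.
 */
static bool defer_free(struct cpu_cache *c, void *p)
{
	assert(c != NULL);

	if (!c->cur || c->cur->nr == BLOCK_PTRS) {
		struct ptr_block *b = c->free_blocks;

		if (!b) {
			/* The kernel would call synchronize_rcu() here. */
			if (warn_on_fallback && !c->warned) {
				fprintf(stderr, "deferred free hit the slow path\n");
				c->warned = true;
			}
			c->slow_hits++;
			free(p);
			return false;
		}
		c->free_blocks = b->next;
		c->nr_free--;
		b->nr = 0;
		b->next = c->cur; /* chain full blocks behind the new one */
		c->cur = b;
	}
	c->cur->ptrs[c->cur->nr++] = p;
	return true;
}

/* Free everything queued and hand the blocks back to the cache. */
static void cache_drain(struct cpu_cache *c)
{
	while (c->cur) {
		struct ptr_block *b = c->cur;
		size_t i;

		c->cur = b->next;
		for (i = 0; i < b->nr; i++)
			free(b->ptrs[i]);
		b->nr = 0;
		b->next = c->free_blocks;
		c->free_blocks = b;
		c->nr_free++;
	}
}
```

With one pre-filled block, the first BLOCK_PTRS calls to defer_free() take
the fast path, the next one falls back and warns once, and cache_drain()
returns the block so later requests are fast again. A larger N given to
cache_prefill() just pushes the fallback further out.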