On Wed 13-10-21 13:32:31, Dave Chinner wrote: > On Mon, Oct 11, 2021 at 01:57:36PM +0200, Michal Hocko wrote: > > On Sat 09-10-21 09:36:49, Dave Chinner wrote: > > > On Fri, Oct 08, 2021 at 09:48:39AM +0200, Michal Hocko wrote: > > > > > > > Even the API constaints of kvmalloc() w.r.t. only doing the vmalloc > > > > > > > fallback if the gfp context is GFP_KERNEL - we already do GFP_NOFS > > > > > > > kvmalloc via memalloc_nofs_save/restore(), so this behavioural > > > > > > > restriction w.r.t. gfp flags just makes no sense at all. > > > > > > > > > > > > GFP_NOFS (without using the scope API) has the same problem as NOFAIL in > > > > > > the vmalloc. Hence it is not supported. If you use the scope API then > > > > > > you can GFP_KERNEL for kvmalloc. This is clumsy but I am not sure how to > > > > > > define these conditions in a more sensible way. Special case NOFS if the > > > > > > scope api is in use? Why do you want an explicit NOFS then? > > > > > > Exactly my point - this is clumsy and a total mess. I'm not asking > > > for an explicit GFP_NOFS, just pointing out that the documented > > > restrictions that "vmalloc can only do GFP_KERNEL allocations" is > > > completely wrong. > > > > > > vmalloc() > > > { > > > if (!(gfp_flags & __GFP_FS)) > > > memalloc_nofs_save(); > > > p = __vmalloc(gfp_flags | GFP_KERNEL) > > > if (!(gfp_flags & __GFP_FS)) > > > memalloc_nofs_restore(); > > > } > > > > > > Yup, that's how simple it is to support GFP_NOFS support in > > > vmalloc(). > > > > Yes, this would work from the functionality POV but it defeats the > > philosophy behind the scope API. Why would you even need this if the > > scope was defined by the caller of the allocator? > > Who actually cares that vmalloc might be using the scoped API > internally to implement GFP_NOFS or GFP_NOIO? Nobody at all. > It is far more useful (and self documenting!) for one-off allocations > to pass a GFP_NOFS flag than it is to use a scope API... I would agree with you if the explicit GFP_NOFS usage was consistent and actually justified in the majority cases. My experience tells me otherwise though. Many filesystems use the flag just because that is easier. That leads to a huge overuse of the flag that leads to practical problems. I was hoping that if we offer an API that would define problematic reclaim recursion scopes then it would reduce the abuse. I haven't expected this to happen overnight but it is few years and it seems it will not happen soon either. [...] > > > It also points out that the scope API is highly deficient. > > > We can do GFP_NOFS via the scope API, but we can't > > > do anything else because *there is no scope API for other GFP > > > flags*. > > > > > > Why don't we have a GFP_NOFAIL/__GFP_RETRY_FOREVER scope API? > > > > NO{FS,IO} where first flags to start this approach. And I have to admit > > the experiment was much less successful then I hoped for. There are > > still thousands of direct NOFS users so for some reason defining scopes > > is not an easy thing to do. > > > > I am not against NOFAIL scopes in principle but seeing the nofs > > "success" I am worried this will not go really well either and it is > > much more tricky as NOFAIL has much stronger requirements than NOFS. > > Just imagine how tricky this can be if you just call a library code > > that is not under your control within a NOFAIL scope. What if that > > library code decides to allocate (e.g. printk that would attempt to do > > an optimistic NOWAIT allocation). > > I already asked you that _exact_ question earlier in the thread > w.r.t. kvmalloc(GFP_NOFAIL) using optimistic NOWAIT kmalloc > allocation. I asked you as a MM expert to define *and document* the > behaviour that should result, not turn around and use the fact that > it is undefined behaviour as a "this is too hard" excuse for not > changing anything. Dave, you have "thrown" a lot of complains in previous emails and it is hard to tell rants from features requests apart. I am sorry but I believe it would be much more productive to continue this discussion if you could mild your tone. Can I ask you to break down your feature requests into separate emails so that we can discuss and track them separately rather in this quite a long thread which has IMHO diverghed from the initial topic. Thanks! > THe fact is that the scope APIs are only really useful for certain > contexts where restrictions are set by higher level functionality. > For one-off allocation constraints the API sucks and we end up with Could you be more specific about these one-off allocation constrains? What would be the reason to define one-off NO{FS,IO} allocation constrain? Or did you have your NOFAIL example in mind? > crap like this (found in btrfs): > > /* > * We're holding a transaction handle, so use a NOFS memory > * allocation context to avoid deadlock if reclaim happens. > */ > nofs_flag = memalloc_nofs_save(); > value = kmalloc(size, GFP_KERNEL); > memalloc_nofs_restore(nofs_flag); Yes this looks wrong indeed! If I were to review such a code I would ask why the scope cannot match the transaction handle context. IIRC jbd does that. I am aware of these patterns. I was pulled in some discussions in the past and in some it turned out that the constrain is not needed at all and in some cases that has led to a proper scope definition. As you point out in your other examples it just happens that it is easier to go an easy path and define scopes ad-hoc to work around allocation API limitations. [...] > IOWs, a large number of the users of the scope API simply make > [k]vmalloc() provide GFP_NOFS behaviour. ceph_kvmalloc() is pretty > much a wrapper that indicates how all vmalloc functions should > behave. Honour GFP_NOFS and GFP_NOIO by using the scope API > internally. I was discouraging from this behavior at vmalloc level to push people to use scopes properly - aka at the level where the reclaim recursion is really a problem. If that is infeasible in practice then we can re-evaluate of course. I was really hoping we can get rid of cargo cult GFP_NOFS usage this way but the reality often disagrees with hopes. All that being said, let's discuss [k]vmalloc constrains and usecases that need changes in a separate email thread. Thanks! -- Michal Hocko SUSE Labs