On Thu 05-09-24 09:53:26, Theodore Ts'o wrote: > On Thu, Sep 05, 2024 at 01:26:50PM +0200, Michal Hocko wrote: > > > > > > This is exactly GFP_KERNEL semantic for low order allocations or > > > > > > kvmalloc for that matter. They simply never fail unless couple of corner > > > > > > cases - e.g. the allocating task is an oom victim and all of the oom > > > > > > memory reserves have been consumed. This is where we call "not possible > > > > > > to allocate". > > > > > > > > > > Which does beg the question of why GFP_NOFAIL exists. > > > > > > > > Exactly for the reason that even rare failure is not acceptable and > > > > there is no way to handle it other than keep retrying. Typical code was > > > > while (!(ptr = kmalloc())) > > > > ; > > > > > > But is it _rare_ failure, or _no_ failure? > > > > > > You seem to be saying (and I just reviewed the code, it looks like > > > you're right) that there is essentially no difference in behaviour > > > between GFP_KERNEL and GFP_NOFAIL. > > That may be the currrent state of affiars; but is it > ****guaranteed**** forever and ever, amen, that GFP_KERNEL will never > fail if the amount of memory allocated was lower than a particular > multiple of the page size? No, GFP_KERNEL is not guaranteed. Allocator tries as hard as it can to satisfy those allocations for order <= PAGE_ALLOC_COSTLY_ORDER. GFP_NOFAIL is guaranteed for order <= 1 for page allocator and there is no practical limit for vmalloc currently. This is what our documentation says * The default allocator behavior depends on the request size. We have a concept * of so-called costly allocations (with order > %PAGE_ALLOC_COSTLY_ORDER). * !costly allocations are too essential to fail so they are implicitly * non-failing by default (with some exceptions like OOM victims might fail so * the caller still has to check for failures) while costly requests try to be * not disruptive and back off even without invoking the OOM killer. * The following three modifiers might be used to override some of these * implicit rules. There is no guarantee this will be that way for ever. This is unlikely to change though. -- Michal Hocko SUSE Labs