Re: [PATCH 0/2 v2] remove PF_MEMALLOC_NORECLAIM

On Thu, Sep 05, 2024 at 09:53:26AM GMT, Theodore Ts'o wrote:
> On Thu, Sep 05, 2024 at 01:26:50PM +0200, Michal Hocko wrote:
> > > > > > This is exactly GFP_KERNEL semantic for low order allocations or
> > > > > > kvmalloc for that matter. They simply never fail unless couple of corner
> > > > > > cases - e.g. the allocating task is an oom victim and all of the oom
> > > > > > memory reserves have been consumed. This is where we call "not possible
> > > > > > to allocate".
> > > > > 
> > > > > Which does beg the question of why GFP_NOFAIL exists.
> > > > 
> > > > Exactly for the reason that even rare failure is not acceptable and
> > > > there is no way to handle it other than keep retrying. Typical code was 
> > > > 	while (!(ptr = kmalloc()))
> > > > 		;
> > > 
> > > But is it _rare_ failure, or _no_ failure?
> > >
> > > You seem to be saying (and I just reviewed the code, it looks like
> > > you're right) that there is essentially no difference in behaviour
> > > between GFP_KERNEL and GFP_NOFAIL.
> 
> That may be the current state of affairs; but is it
> ****guaranteed**** forever and ever, amen, that GFP_KERNEL will never
> fail if the amount of memory allocated was lower than a particular
> multiple of the page size?  If so, what is that size?  I've checked,
> and this is not documented in the formal interface.

Yeah, and I think we really need to make that happen, in order to head
off a lot more silliness in the future.

We'd also be documenting at the same time _exactly_ when it is required
to check for errors:
- small, fixed-size allocation in a known sleepable context: safe to skip
- anything else, i.e. variable-sized allocation or library code that can
  be called from different contexts: you check for errors (and probably
  that's just "something crazy has happened, emergency shutdown" for the
  xfs/ext4 paths)
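
Roughly, as a userspace sketch of that rule (not kernel code; the limit
below is a made-up placeholder since the real one isn't documented yet,
and may_skip_error_check() and alloc_table() are hypothetical names):

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Placeholder only: the real "small allocation" limit is not documented. */
#define HYPOTHETICAL_SMALL_LIMIT 4096u

/*
 * The error check may be skipped only when all three conditions hold:
 * the size is fixed, the context is known to be sleepable, and the
 * allocation is small.
 */
static bool may_skip_error_check(size_t size, bool fixed_size,
				 bool known_sleepable)
{
	return fixed_size && known_sleepable &&
	       size <= HYPOTHETICAL_SMALL_LIMIT;
}

/*
 * Variable-sized allocation: always checked, and the failure is
 * propagated to the caller. malloc() stands in for the kernel allocator.
 */
static int alloc_table(size_t nr, size_t elem_size, void **out)
{
	void *p;

	/* Reject sizes whose product would overflow size_t. */
	if (elem_size != 0 && nr > SIZE_MAX / elem_size)
		return -EOVERFLOW;

	p = malloc(nr * elem_size);
	if (!p)
		return -ENOMEM;

	*out = p;
	return 0;
}
```

A caller of alloc_table() always looks at the return value; the
predicate only spells out when a fixed-size caller could be allowed
not to.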

> > The fundamental difference is that (apart from unsupported allocation
> > mode/size) the latter never returns NULL and you can rely on that fact.
> > Our documentation says:
> >  * %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
> >  * cannot handle allocation failures. The allocation could block
> >  * indefinitely but will never return with failure. Testing for
> >  * failure is pointless.
> 
> So if the documentation is going to give similar guarantees, as
> opposed to it being an accident of the current implementation that is
> subject to change at any time, then sure, we can probably get away
> with all or most of ext4's uses of __GFP_NOFAIL.  But I don't want to
> do that and then have a "Lucy and Charlie Brown" moment from the
> Peanuts comic strip where the football suddenly gets snatched away
> from us[1] (and many file system users will be very, very sad and/or
> angry).

Yeah, absolutely, and the "what is a small allocation" limit needs to be
nailed down as well.



