Re: How to handle an kmalloc failure in evict_inode()?

On Mon, Aug 04, 2014 at 07:41:01PM -0400, Theodore Ts'o wrote:
> 
> I've been trying to figure out the best way to handle potential memory
> allocation failures in evict_inode(), since there doesn't seem to be any
> way to reflect errors back up to the caller, or to delay the inode
> eviction until more memory might be available.
> 
> There doesn't seem to be a good solution; right now, in ext4, we mark
> the file system as being inconsistent, thus triggering a remount
> read-only or a panic, which seems to be.... sub-optimal.
> 
> I was looking to see what xfs does to handle this case, and as near as I
> can tell, xfs_fs_evict_inode() calls xfs_inactive(), which in turn calls
> xfs_free_eofblocks() --- and ignores the error return.

Sure, because failing to free EOF blocks on referenced inodes (i.e. link
count > 0) is not a serious problem. Freeing speculative preallocation
beyond EOF is a best-effort operation, so a lock inversion or a memory
allocation failure simply means we don't do it. It'll get cleaned up
in the future (i.e. next time the inode gets pulled into cache).

> And when I look
> to see whether xfs_free_eofblocks() can return ENOMEM, it appears that
> it can, via a call path that involves xfs_bmapi_read(),
> xfs_iread_extents(), xfs_bmap_read_extents(), xfs_btree_read_bufl(),
> xfs_trans_read_buf(), xfs_trans_read_buf_map(), xfs_buf_read_map(),
> xfs_buf_get_map(), _xfs_buf_alloc(), which can return ENOMEM.

Yes, _xfs_buf_alloc() can return ENOMEM, but did you actually look
at the memory allocation code? i.e. the kmem_*alloc(KM_NOFS) calls?
Put simply, KM_NOFS allocations *never fail* in XFS, unless
KM_NOSLEEP (i.e. GFP_ATOMIC) or KM_MAYFAIL are also specified. So
_xfs_buf_alloc() will never actually return ENOMEM at this point.
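
As a rough illustration of those semantics, here is a userspace sketch
of a "retry until it works" allocator wrapper. It is not the XFS
kmem_alloc() source: the SK_* flags, sk_alloc(), sk_try_alloc() and the
failure-injection counters are all hypothetical names invented for this
example, and the backoff the kernel would do between attempts is left
out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flags that mirror the KM_* flags discussed above. */
#define SK_NOFS     0x1u  /* caller must not recurse into the filesystem */
#define SK_NOSLEEP  0x2u  /* caller cannot block, so failure is allowed */
#define SK_MAYFAIL  0x4u  /* caller has an error path for ENOMEM */

/* Failure injection for the sketch: fail this many upcoming attempts. */
static int sk_fail_next;
/* Counts how many low-level attempts were made. */
static int sk_attempts;

/* One allocation attempt; stands in for the underlying allocator. */
static void *sk_try_alloc(size_t size)
{
    sk_attempts++;
    if (sk_fail_next > 0) {
        sk_fail_next--;
        return NULL;
    }
    return malloc(size);
}

/*
 * Returns NULL only when the caller opted in to failure with
 * SK_NOSLEEP or SK_MAYFAIL; otherwise keeps retrying until the
 * allocation succeeds.
 */
void *sk_alloc(size_t size, unsigned int flags)
{
    assert(size > 0);
    for (;;) {
        void *ptr = sk_try_alloc(size);

        if (ptr || (flags & (SK_NOSLEEP | SK_MAYFAIL)))
            return ptr;
        /* A kernel implementation would wait here before retrying. */
    }
}
```

With three injected failures, sk_alloc(64, SK_NOFS) makes four attempts
and returns a valid pointer, while sk_alloc(64, SK_NOFS | SK_MAYFAIL)
gives up after the first failed attempt and returns NULL.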


Historically speaking, the core XFS code has little in the way of
ENOMEM error handling because the OS it came from (Irix) guaranteed
that critical memory allocations would always succeed. The
kmem_*alloc() wrappers still implement those semantics today on
Linux for the core XFS code.

We're slowly adding all the ENOMEM handling code we need through the
stack to handle ENOMEM sanely. However, until we have transaction
rollback code (i.e. to cancel dirty transactions), we can't ever fail
memory allocations in transactions. Hence the error handling code is
slowly appearing, but the core allocator still doesn't allow failure
to occur....

> As near as I can tell, at least for ext4, we've been remarkably lucky,
> in that GFP_NOFS allocations seem to rarely if ever fail.

Oh, they fail quite frequently....

> However,
> under enough memory pressure, and in a situation where the OOM killer
> has been configured to be less aggressive, it is possible (which is as
> we would expect) for a kmalloc() to fail, and short of using
> __GFP_NOFAIL, or using a retry loop, I'm not sure there's a good answer
> to this problem.

Yup, that's why XFS uses __GFP_NOWARN and a retry loop - because
there are places where failure to allocate memory can have only
one result: denial of service via a filesystem shutdown.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx