On Mon, Aug 04, 2014 at 07:41:01PM -0400, Theodore Ts'o wrote:
>
> I've been trying to figure out the best way to handle potential memory
> allocation failures in evict_inode(), since there doesn't seem to be any
> way to reflect errors back up to the caller, or to delay the inode
> eviction until more memory might be available.
>
> There doesn't seem to be a good solution; right now, in ext4, we mark
> the file system as being inconsistent, thus triggering a remount
> read-only or a panic, which seems to be.... sub-optimal.
>
> I was looking to see what xfs does to handle this case, and as near as I
> can tell, xfs_fs_evict_inode() calls xfs_inactive(), which in turn calls
> xfs_free_eofblocks() --- and ignores the error return.

Sure, because failing to free EOF blocks on referenced inodes (i.e.
link count > 0) is not a serious problem. i.e. freeing speculative
preallocation beyond EOF is a best effort operation, so lock
inversions or memory allocation failure simply mean we don't do it.
It'll get cleaned up in the future (i.e. next time the inode gets
pulled into cache).

> And when I look
> to see whether xfs_free_eofblocks() can return ENOMEM, it appears that
> it can, via a call path that involves xfs_bmapi_read(),
> xfs_iread_extents(), xfs_bmap_read_extents(), xfs_btree_read_bufl(),
> xfs_trans_read_buf(), xfs_trans_read_buf_map(), xfs_buf_read_map(),
> xfs_buf_get_map(), _xfs_buf_alloc(), which can return ENOMEM.

Yes, _xfs_buf_alloc() can return ENOMEM, but did you actually look
at the memory allocation code? i.e. the kmem_*alloc(KM_NOFS) calls?

Put simply, KM_NOFS allocations *never fail* in XFS, unless
KM_NOSLEEP (i.e. GFP_ATOMIC) or KM_MAYFAIL are also specified. So
_xfs_buf_alloc() will never actually return ENOMEM at this point.

Historically speaking, the core XFS code has little in the way of
ENOMEM error handling because the OS it came from (Irix) guaranteed
that critical memory allocations would always succeed.
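
For illustration, the "never fails unless KM_NOSLEEP or KM_MAYFAIL is
set" rule can be sketched in userspace C roughly as below. This is a
minimal sketch, not the XFS source: every sk_* name is hypothetical,
malloc() stands in for the kernel allocator, and sk_fail_budget exists
only to inject failures so the retry path can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flags, loosely modelled on the KM_* flags named above. */
#define SK_NOSLEEP 0x1u /* caller cannot wait, so failure is allowed */
#define SK_MAYFAIL 0x2u /* caller handles NULL, so failure is allowed */
#define SK_NOFS    0x4u /* no recursion into the fs; does NOT allow failure */

/* Fault injection: the next sk_fail_budget attempts return NULL. */
static int sk_fail_budget;
/* Number of times sk_alloc() had to go around the loop again. */
static unsigned long sk_retries;

/* A single allocation attempt that may fail. */
static void *sk_try_alloc(size_t size)
{
	if (sk_fail_budget > 0) {
		sk_fail_budget--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Allocate with retry-until-success semantics. NULL is returned only
 * if the caller opted in to failure via SK_NOSLEEP or SK_MAYFAIL.
 */
static void *sk_alloc(size_t size, unsigned int flags)
{
	assert(size > 0);

	for (;;) {
		void *ptr = sk_try_alloc(size);

		if (ptr || (flags & (SK_NOSLEEP | SK_MAYFAIL)))
			return ptr;

		/*
		 * Assumption: a kernel implementation would back off
		 * here and wait for reclaim to make progress before
		 * trying again. This sketch only counts the retry.
		 */
		sk_retries++;
	}
}
```

With sk_fail_budget set to 3, sk_alloc(64, SK_NOFS) goes around the
loop three times and then returns a valid pointer, whereas
sk_alloc(64, SK_NOFS | SK_MAYFAIL) returns NULL on the first failed
attempt. The caller of the first form never sees an allocation failure.
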
The kmem_*alloc() wrappers still implement those semantics today on
Linux for the core XFS code.

We're slowly adding all the ENOMEM handling code we need through the
stack to handle ENOMEM sanely. However, until we have transaction
rollback code (i.e. to cancel dirty transactions), we can't ever fail
memory allocations in transactions. Hence the error handling code is
slowly appearing, but the code allocator still doesn't allow failure
to occur....

> As near as I can tell, at least for ext4, we've been remarkably lucky,
> in that GFP_NOFS allocations seem to rarely if ever fail.

Oh, they fail quite frequently....

> However,
> under enough memory pressure, and in a situation where the OOM killer
> has been configured to be less aggressive, it is possible (which is as
> we would expect) for a kmalloc() to fail, and short of using
> __GFP_NOFAIL, or using a retry loop, I'm not sure there's a good answer
> to this problem.

Yup, that's why XFS uses __GFP_NOWARN and a retry loop - because
there are places where failure to allocate memory can have only one
result: denial of service via a filesystem shutdown.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx