On Tue, Aug 05, 2014 at 10:17:17PM +1000, Dave Chinner wrote:
> IOWs, the longer term plan is to move all this stuff to async
> workqueue processing and so be able to defer and batch unlink and
> reclaim work more efficiently:
> http://xfs.org/index.php/Improving_inode_Caching#Inode_Unlink

I discussed doing this for ext4 a while back (because on a very busy
machine, unlink latency can be quite large). I got pushback because
people were concerned about what happens when a very large directory
gets deleted --- say, you're cleaning up the directory belonging to a
(for example, Docker / Borg / Omega) job that has been shut down, so the
equivalent of an "rm -rf" of several hundred files comprising tens or
hundreds of megabytes or gigabytes. The fact that all of the unlinks
have returned without the space being available could confuse a number
of programs.

And it's not just "df": if the user is over quota, they still aren't
allowed to write for seconds or minutes, because the block release only
takes place in a workqueue that could potentially get deferred for a
non-trivial amount of time.

I could imagine recruiting a process whose block allocation would
otherwise have failed with ENOSPC or EDQUOT to help complete the
deallocation of inodes and release disk space, but then we're moving the
latency variability from the unlink() call to an otherwise innocent
production job that is trying to do file writes.

So the user visibility is more than just the df statistics; it's also
some file writes either failing or suffering increased latency until the
blocks can be reclaimed.

Have the XFS developers considered these sorts of concerns, and are
there any solutions to these issues that you've contemplated?

Cheers,

						- Ted
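
[The deferred-unlink pattern and the "recruit the ENOSPC'd writer"
mitigation being discussed can be sketched as follows. This is a toy
userspace model, not ext4 or XFS code; the `ToyFS` class and its method
names are invented for illustration. unlink() merely queues the block
release, a background worker would normally drain the queue, and a
writer about to hit ENOSPC is drafted to drain it synchronously before
the allocation is failed.]

```python
# Toy model of deferred unlink processing (hypothetical, not kernel code):
# unlink() defers the actual block release; a writer that would otherwise
# get ENOSPC is recruited to complete the pending deallocations itself.

from collections import deque

class ToyFS:
    def __init__(self, total_blocks):
        self.total = total_blocks
        self.used = 0
        self.pending = deque()   # deferred block-release work items

    def unlink(self, blocks):
        """Fast-path unlink: queue the block release and return at once."""
        self.pending.append(blocks)

    def drain_pending(self):
        """Normally run from a workqueue; actually releases the blocks."""
        while self.pending:
            self.used -= self.pending.popleft()

    def alloc(self, blocks):
        """Writer path: on a would-be ENOSPC, help finish deallocation
        before giving up -- moving the latency onto this writer."""
        if self.used + blocks > self.total:
            self.drain_pending()          # recruited to help
        if self.used + blocks > self.total:
            raise OSError("ENOSPC")       # genuinely out of space
        self.used += blocks

fs = ToyFS(total_blocks=100)
fs.used = 90
fs.unlink(40)            # "rm -rf" returns immediately...
assert fs.used == 90     # ...but df/quota still count the space as used
fs.alloc(30)             # writer drains the queue, pays the latency, succeeds
assert fs.used == 80     # 90 - 40 freed + 30 newly allocated
```

[Note how the assertions capture both concerns from the thread: the
space stays invisible to df until something drains the queue, and the
drain cost lands on the innocent writer rather than on unlink().]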