On Mon, Oct 2, 2017 at 1:49 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Sun, Oct 01, 2017 at 03:10:03PM -0700, Sargun Dhillon wrote:
>> I'm running into an issue where xfs aild is locking up. This is on
>> kernel version 4.9.34. It's an SMP system with 32 cores, and ~250G of
>> RAM (AWS R4.8XL) and an XFS filesystem with 1 SSD with project ID
>> quotas in use. It's the only XFS filesystem on the host. The root
>> partition is running EXT4, and isn't involved in this.
>>
>> There are containers that use overlayfs atop this filesystem. It looks
>> like one of the processes (10090, or 11504) has gotten into a state
>> where it's holding a lock on a xfs_buf, and they're trying to lock
>> xfs_buf's which are currently on the xfs ail list.
>> ...
> Ok, this is a RENAME_WHITEOUT case, and that points to the issue.
> The whiteout inode is allocated as a temporary inode, which means
> it remains on the unlinked list so that if we crash part way through
> the update log recovery will free it again.
>
> Once all the dirent updates and other rename work is done, we remove
> the whiteout inode from the unlinked list, and that requires
> grabbing the AGI lock. That's what we are stuck on here.
> ...
>
> Because this is the deadlock - we're trying to lock the AGF with an
> AGI already locked. That means the above RENAME_WHITEOUT has either
> allocated or freed extents in manipulating the dirents during
> rename, and so holds an AGF locked. It's a classic ABBA deadlock.
>
> That's the problem, not sure what the solution is yet - there's no
> obvious or simple way around this RENAME_WHITEOUT behaviour (which
> only affects overlay, fwiw). I'll have a think about it.
>

Dave,

Could you explain why the RENAME_WHITEOUT case is different locking
order wise from linking an O_TEMPFILE? Is it because
xfs_iunlink_remove() is called before xfs_dir_createname() in
xfs_link()?

Also, in xfs_rename(), before removing the whiteout inode from the
unlinked list, the comment says:

"If we fail here after bumping the link count, we're shutting down the
filesystem so we'll never see the intermediate state on disk."

but I don't actually see where that shutdown takes place, or maybe I
don't know what to look for.

Thanks,
Amir.