[RFD] XFS inode reclaim (inactivation) under fs freeze


 



Hi,

We have reports on distro (pre-deferred inactivation) kernels that inode
reclaim (e.g. via drop_caches) can deadlock on the s_umount lock when
invoked on a frozen XFS fs. This occurs because drop_caches acquires the
lock and then blocks in xfs_inactive() on transaction alloc for an inode
that requires a post-EOF block (eofb) trim. Unfreeze blocks on the same
lock and thus the fs is deadlocked (in a frozen state). As far as I'm
aware, this has been broken for some time and probably just not observed
because reclaim under freeze is a rare and unusual situation.
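
For illustration only, the lock cycle described above can be modeled as
a tiny wait-for graph in userspace C. This is a sketch, not kernel code:
every name in it (the enums, struct model, frozen_reclaim(),
deadlocked()) is hypothetical, and the frozen state is treated as a
resource that only a thaw can release.

```c
/* Toy wait-for-graph model of the reported deadlock. Userspace only;
 * all names are hypothetical and are not kernel interfaces. */
#include <assert.h>
#include <stdbool.h>

enum actor { DROP_CACHES, THAW, NR_ACTORS };
enum resource { S_UMOUNT, FREEZE_STATE, NR_RESOURCES };

struct model {
	int holder[NR_RESOURCES];  /* actor that must release it, or -1 */
	int waits_on[NR_ACTORS];   /* resource the actor sleeps on, or -1 */
};

/* Build the scenario: the fs is frozen and drop_caches runs reclaim. */
struct model frozen_reclaim(bool inode_needs_inactivation)
{
	struct model m;

	/* drop_caches holds s_umount while it reclaims inodes. */
	m.holder[S_UMOUNT] = DROP_CACHES;
	/* Only a thaw can lift the frozen state. */
	m.holder[FREEZE_STATE] = THAW;
	/* The thaw path needs s_umount before it can proceed. */
	m.waits_on[THAW] = S_UMOUNT;
	/* Transaction allocation sleeps until the fs is thawed. */
	m.waits_on[DROP_CACHES] = inode_needs_inactivation ? FREEZE_STATE : -1;
	return m;
}

/* True if some actor transitively waits on itself. */
bool deadlocked(const struct model *m)
{
	assert(m);
	for (int start = 0; start < NR_ACTORS; start++) {
		int cur = start;

		for (int step = 0; step < NR_ACTORS; step++) {
			int res = m->waits_on[cur];

			if (res < 0 || m->holder[res] < 0)
				break;
			cur = m->holder[res];
			if (cur == start)
				return true;
		}
	}
	return false;
}
```

In this model, frozen_reclaim(true) forms a cycle (drop_caches waits on
the thaw, the thaw waits on s_umount), while frozen_reclaim(false) does
not, which matches the observation that only inodes needing inactivation
trigger the hang.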
    
With deferred inactivation, the deadlock problem actually goes away
because ->destroy_inode() will never block when the filesystem is
frozen. There is new/odd behavior, however, in that lookups of a pending
inactive inode spin in a loop waiting for the pending inactive state to
clear, and that won't happen until the fs is unfrozen.

Also, the deferred inactivation queues are not consistently flushed on
freeze. I've observed that xfs_freeze invokes an xfs_inodegc_flush()
indirectly via xfs_fs_statfs(), but fsfreeze does not. Therefore, I
suspect it may be possible to land in this state from the onset of a
freeze based on prior reclaim behavior. (I.e., we may want to make this
flush explicit on freeze, depending on the eventual solution.)

Some internal discussion followed on potential improvements in response
to the deadlock report. Dave suggested potentially preventing reclaim of
inodes that would require inactivation, keeping them in cache, but it
appears we may not have enough control in the local fs to guarantee this
behavior out of the vfs and shrinkers (Dave can chime in on details, if
needed). He also suggested skipping eofb trims and sending such inodes
directly to reclaim. My current preference is to invoke an inodegc flush
and blockgc scan during the freeze sequence so that, presumably, no
inodes that are pending inactivation but still accessible (i.e. not
unlinked) can reside in the queues for the duration of a freeze. Perhaps
others have different ideas or thoughts on these.
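
As a rough illustration of the flush-on-freeze idea, the following
userspace toy model contrasts a freeze with and without an up-front
inodegc flush and blockgc scan. It is a sketch under simplifying
assumptions: all toy_* names are hypothetical, unlinked inodes are
ignored, and inactivation is reduced to clearing a flag.

```c
/* Toy model of freeze with/without a prior flush. Userspace only;
 * the toy_* names are hypothetical, not kernel interfaces. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	bool eof_blocks;        /* has post-EOF blocks to trim */
	bool cached;            /* still in the inode cache */
	bool pending_inactive;  /* queued for deferred inactivation */
};

struct toy_fs {
	bool frozen;
	struct toy_inode *inodes;
	size_t nr;
};

/* Reclaim: inodes needing a trim are queued rather than blocking. */
void toy_reclaim(struct toy_inode *ip)
{
	ip->cached = false;
	if (ip->eof_blocks)
		ip->pending_inactive = true;
}

/* Drain the queue; only possible while the fs is not frozen. */
void toy_inodegc_flush(struct toy_fs *fs)
{
	assert(!fs->frozen);
	for (size_t i = 0; i < fs->nr; i++) {
		if (fs->inodes[i].pending_inactive) {
			fs->inodes[i].eof_blocks = false;
			fs->inodes[i].pending_inactive = false;
		}
	}
}

/* Trim post-EOF blocks on inodes that are still cached. */
void toy_blockgc_scan(struct toy_fs *fs)
{
	assert(!fs->frozen);
	for (size_t i = 0; i < fs->nr; i++)
		if (fs->inodes[i].cached)
			fs->inodes[i].eof_blocks = false;
}

/* Freeze, optionally flushing and scanning first. */
void toy_freeze(struct toy_fs *fs, bool flush_first)
{
	if (flush_first) {
		toy_inodegc_flush(fs);
		toy_blockgc_scan(fs);
	}
	fs->frozen = true;
}

/* A lookup stalls on a pending-inactive inode until the fs thaws. */
bool toy_lookup_stalls(const struct toy_fs *fs, const struct toy_inode *ip)
{
	return fs->frozen && ip->pending_inactive;
}
```

With two inodes that both carry post-EOF blocks, reclaiming one before
toy_freeze(fs, false) and one after it leaves both stalling lookups in
this model. With toy_freeze(fs, true), the queued inode is drained and
the cached one is trimmed, so neither is left pending for the duration
of the freeze.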

In any event, this is an FYI/RFD given that deferred inactivation is
fairly new and I'm not sure we have a concrete sense of whether the
resulting changes in behavior might be more or less observable (and/or
disruptive) to users than the historical, more severe problem.

Brian



