quotaoff, transaction quiesce, and dquot logging

Brian Foster <bfoster@xxxxxxxxxx> · Fri, 4 Sep 2020 11:59:49 -0400

Hi Dave,

I'm finally getting back to the quotaoff thing we discussed a ways
back[1] and doing some auditing to make sure that I understand the
approach and that it seems correct. To refresh, your original prototype
and the slightly different one I'm looking into implement the same
general scheme:

1.) quiesce the transaction subsystem
2.) disable quota(s) (update state flags)
3.) log quotaoff start/end items (synchronous)
4.) open the transaction subsystem
5.) release all inode dquot references and purge dquots

The idea is that the critical invariant requred for quotaoff is that no
dquots are logged after the quotaoff end item is committed to the log.
Otherwise there is no guarantee that the tail pushes past the quotaoff
item and a subsequent crash/recovery incorrectly replays dquot changes
for an inactive quota mode.

As it is, I think there's at least one assumption we've made that isn't
entirely accurate. It looks to me that steps 1-4 don't guarantee that
dquots aren't logged after the transaction subsystem is released. The
current code (and my prototype) only clear the *QUOTA_ACTIVE flags at
that point, and various transactions might have already acquired or
attached dquots to inodes before the transaction allocation even occurs.
Once the transaction is allocated, various paths basically only care if
we have a dquot or not.

For example, xfs_create() gets the dquots up front, allocs the
transaction and xfs_trans_reserve_quota_bydquots() attaches any of the
associated dquots to the transaction. xfs_trans_reserve_quota_bydquots()
checks for (!QUOTA_ON() || !QUOTA_RUNNING()), but those only help us if
all quotas have been disabled. Consider if one of multiple active quotas
are being turned off, and that this path already has dquots for both,
for example.

I do notice that your prototype[1] clears all of the quota flags (not
just the ACTIVE flags) after the transaction barrier is released. This
prevents further modifications in some cases, but it doesn't seem like
that is enough to avoid violating the invariant described above. E.g.,
xfs_trans_apply_dquot_deltas() logs the dquot regardless of whether
changes are made (and actually looks like it can make some changes on
the dquot even if the transaction doesn't) after the dquot is attached
to the transaction.

This does make me wonder a bit whether we should rework the transaction
commit path to avoid modifying/logging the dquot completely if the quota
is inactive or accounting is disabled. When starting to look around with
that in mind, I see the following in xfs_quota_defs.h:

/*
 * Checking XFS_IS_*QUOTA_ON() while holding any inode lock guarantees
 * quota will be not be switched off as long as that inode lock is held.
 */
#define XFS_IS_QUOTA_ON(mp)     ((mp)->m_qflags & (XFS_UQUOTA_ACTIVE | \
                                                   XFS_GQUOTA_ACTIVE | \
                                                   XFS_PQUOTA_ACTIVE))
...

So I'm wondering how safe that actually would be, or even how safe it is
to clear the ACCT|ENFD flags before we release/purge dquots. It seems
like that conflicts with the above documentation, at least, but I'm not
totally clear on the reason for that rule. In any event, I'm still
poking around a bit, but unless I'm missing something in the analysis
above it doesn't seem like this is a matter of simply altering the
quotaoff path as originally expected. Thoughts or ideas appreciated.

Brian

[1] https://lore.kernel.org/linux-xfs/20200702115144.GH2005@xxxxxxxxxxxxxxxxxxx/