On Thu, May 14, 2020 at 09:56:58AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> ...
>
> Fix this by changing the ondisk dquot initialization function to use
> ordered buffers to write out fresh dquot blocks if it detects that we're
> running quotacheck. If the system goes down before quotacheck can
> complete, the CHKD flags will not be set in the superblock and the next
> mount will run quotacheck again, which can fix uninitialized dquot
> buffers. This requires amending the defer code to maintain ordered
> buffer state across defer rolls for the sake of the dquot allocation
> code.
>
> For regular operations we preserve the current behavior since the dquot
> items require properly initialized ondisk dquot records.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> ---
> v2: rework the code comment explaining all this
> ---
>  fs/xfs/libxfs/xfs_defer.c |   10 +++++++
>  fs/xfs/xfs_dquot.c        |   62 ++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 58 insertions(+), 14 deletions(-)
> ...
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index 52e0f7245afc..f60a8967f9d5 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
...
> @@ -238,11 +240,45 @@ xfs_qm_init_dquot_blk(
...
> +
> +	/*
> +	 * When quotacheck runs, we use delayed writes to update all the dquots
> +	 * on disk in an efficient manner instead of logging the individual
> +	 * dquot changes as they are made.
> +	 *
> +	 * Hence if we log the buffer that we allocate here, then crash
> +	 * post-quotacheck while the logged initialisation is still in the
> +	 * active region of the log, we can lose the information quotacheck
> +	 * wrote directly to the buffer. That is, log recovery will replay the
> +	 * dquot buffer initialisation over the top of whatever information
> +	 * quotacheck had written to the buffer.
> +	 *
> +	 * To avoid this problem, dquot allocation during quotacheck needs to
> +	 * avoid logging the initialised buffer, but we still need to have
> +	 * writeback of the buffer pin the tail of the log so that it is
> +	 * initialised on disk before we remove the allocation transaction from
> +	 * the active region of the log. Marking the buffer as ordered instead
> +	 * of logging it provides this behaviour.
> +	 *
> +	 * If we crash before quotacheck completes, a subsequent quotacheck run
> +	 * will re-allocate and re-initialize the dquot records as needed.
> +	 */

I took a stab at condensing the comment a bit, FWIW (diff below). LGTM
either way. Thanks for the update.

Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>

> +	if (!(mp->m_qflags & qflag))
> +		xfs_trans_ordered_buf(tp, bp);
> +	else
> +		xfs_trans_log_buf(tp, bp, 0, BBTOB(q->qi_dqchunklen) - 1);
>  }
>
>  /*
>

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index f60a8967f9d5..55b95d45303b 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -254,26 +254,20 @@ xfs_qm_init_dquot_blk(
 	xfs_trans_dquot_buf(tp, bp, blftype);
 
 	/*
-	 * When quotacheck runs, we use delayed writes to update all the dquots
-	 * on disk in an efficient manner instead of logging the individual
-	 * dquot changes as they are made.
+	 * quotacheck uses delayed writes to update all the dquots on disk in an
+	 * efficient manner instead of logging the individual dquot changes as
+	 * they are made. However if we log the buffer allocated here and crash
+	 * after quotacheck while the logged initialisation is still in the
+	 * active region of the log, log recovery can replay the dquot buffer
+	 * initialisation over the top of the checked dquots and corrupt quota
+	 * accounting.
 	 *
-	 * Hence if we log the buffer that we allocate here, then crash
-	 * post-quotacheck while the logged initialisation is still in the
-	 * active region of the log, we can lose the information quotacheck
-	 * wrote directly to the buffer. That is, log recovery will replay the
-	 * dquot buffer initialisation over the top of whatever information
-	 * quotacheck had written to the buffer.
-	 *
-	 * To avoid this problem, dquot allocation during quotacheck needs to
-	 * avoid logging the initialised buffer, but we still need to have
-	 * writeback of the buffer pin the tail of the log so that it is
-	 * initialised on disk before we remove the allocation transaction from
-	 * the active region of the log. Marking the buffer as ordered instead
-	 * of logging it provides this behaviour.
-	 *
-	 * If we crash before quotacheck completes, a subsequent quotacheck run
-	 * will re-allocate and re-initialize the dquot records as needed.
+	 * To avoid this problem, quotacheck cannot log the initialised buffer.
+	 * We must still dirty the buffer and write it back before the
+	 * allocation transaction clears the log. Therefore, mark the buffer as
+	 * ordered instead of logging it directly. This is safe for quotacheck
+	 * because it detects and repairs allocated but uninitialized dquot
+	 * blocks in the quota inodes.
 	 */
 	if (!(mp->m_qflags & qflag))
 		xfs_trans_ordered_buf(tp, bp);