On Sat 16-11-24 00:04:49, Ojaswin Mujoo wrote: > One of the paths quota writeback is called from is: > > freeze_super() > sync_filesystem() > ext4_sync_fs() > dquot_writeback_dquots() > > Since we currently don't always flush the quota_release_work queue in > this path, we can end up with the following race: > > 1. dquot are added to releasing_dquots list during regular operations. > 2. FS freeze starts, however, this does not flush the quota_release_work queue. > 3. Freeze completes. > 4. Kernel eventually tries to flush the workqueue while FS is frozen which > hits a WARN_ON since transaction gets started during frozen state: > > ext4_journal_check_start+0x28/0x110 [ext4] (unreliable) > __ext4_journal_start_sb+0x64/0x1c0 [ext4] > ext4_release_dquot+0x90/0x1d0 [ext4] > quota_release_workfn+0x43c/0x4d0 > > Which is the following line: > > WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE); > > Which ultimately results in generic/390 failing due to dmesg > noise. This was detected on powerpc machine 15 cores. > > To avoid this, make sure to flush the workqueue during > dquot_writeback_dquots() so we dont have any pending workitems after > freeze. > > Reported-by: Disha Goel <disgoel@xxxxxxxxxxxxx> > Signed-off-by: Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx> Thanks for debugging this! > diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c > index 3dd8d6f27725..2782cfc8c302 100644 > --- a/fs/quota/dquot.c > +++ b/fs/quota/dquot.c > @@ -729,6 +729,8 @@ int dquot_writeback_dquots(struct super_block *sb, int type) > sb->dq_op->write_info(sb, cnt); > dqstats_inc(DQST_SYNCS); > > + flush_delayed_work("a_release_work); > + I'd rather do this at the start of dquot_writeback_dquots(). Chances are this saves some retry loops in the dirty list iterations. That being said I don't think this is enough as I'm thinking about it. iput() can be called anytime while the filesystem is frozen (just freeze the filesystem and do echo 3 >/proc/sys/vm/drop_caches) which will consequently call dquot_drop() -> dqput(). This should not be really freeing the dquot on-disk structure (the inode itself is still accounted there) but nevertheless it may end up dropping the last dquot in-memory reference and ext4_release_dquot() will call ext4_journal_start() and complain. So I think on top of this patch which makes sense on its own and deals with 99.9% of cases, we also need ext4 specific fix which uses sb_start_intwrite() to get freeze protection in ext4_release_dquot() (and in principle we always needed this, delayed dquot releasing does not influence this particular problem). Some care will be needed if the transaction is already started when ext4_release_dquot() is called - you can take inspiration in how ext4_evict_inode() handles this. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR