Re: [PATCH 1/1] quota: flush quota_release_work upon quota writeback

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024/11/18 20:53, Jan Kara wrote:
On Mon 18-11-24 09:29:19, Baokun Li wrote:
On 2024/11/17 1:59, Ojaswin Mujoo wrote:
On Sat, Nov 16, 2024 at 02:20:26AM +0530, Ritesh Harjani wrote:
Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx> writes:

One of the paths quota writeback is called from is:

freeze_super()
    sync_filesystem()
      ext4_sync_fs()
        dquot_writeback_dquots()

Since we currently don't always flush the quota_release_work queue in
this path, we can end up with the following race:

   1. dquot are added to releasing_dquots list during regular operations.
   2. FS freeze starts, however, this does not flush the quota_release_work queue.
   3. Freeze completes.
   4. Kernel eventually tries to flush the workqueue while FS is frozen which
      hits a WARN_ON since transaction gets started during frozen state:

    ext4_journal_check_start+0x28/0x110 [ext4] (unreliable)
    __ext4_journal_start_sb+0x64/0x1c0 [ext4]
    ext4_release_dquot+0x90/0x1d0 [ext4]
    quota_release_workfn+0x43c/0x4d0

Which is the following line:

    WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);

Which ultimately results in generic/390 failing due to dmesg
noise. This was detected on powerpc machine 15 cores.

To avoid this, make sure to flush the workqueue during
dquot_writeback_dquots() so we dont have any pending workitems after
freeze.
Not just that, sync_filesystem can also be called from other places and
quota_release_workfn() could write out and and release the dquot
structures if such are found during processing of releasing_dquots list.
IIUC, this was earlier done in the same dqput() context but had races
with dquot_mark_dquot_dirty(). Hence the final dqput() will now add the
dquot structures to releasing_dquots list and will schedule a delayed
workfn which will process the releasing_dquots list.
Hi Ritesh,

Ohh right, thanks for the context. I see this was done here:

    dabc8b207566 quota: fix dqput() to follow the guarantees dquot_srcu
    should provide
Yup.

Nice catch! Thanks for fixing this up!

Have you tested the performance impact of this patch? It looks like the
unconditional call to flush_delayed_work() in dquot_writeback_dquots()
may have some performance impact for frequent chown/sync scenarios.
Well, but sync(2) or so is expensive anyway. Also dquot_writeback_dquots()
should persist all pending quota modifications and it is true that pending
dquot_release() calls can remove quota structures from the quota file and
thus are by definition pending modifications. So I agree with Ojaswin that
putting the workqueue flush there makes sense and is practically required
for data consistency guarantees.
Make sense.
When calling release_dquot(), we will only remove the quota of an object
(user/group/project) from disk if it is not quota-limited and does not
use any inode or block.

Asynchronous removal is now much more performance friendly, not only does
it make full use of the multi-core, but for scenarios where we have to
repeatedly chown between two objects, delayed release avoids the need to
repeatedly allocate/free space in memory and on disk.
True, but unless you call sync(2) in between these two calls this is going
to still hold.
Yeah without sync or syncfs, it's the same as before.
Overall, since the actual dirty data is already on the disk, there is no
consistency issue here as it is just clearing unreferenced quota on the
disk, so I thought maybe it would be better to call flush_delayed_work()
in the freeze context.
To summarise, I don't think real-life workloads are going to observe the
benefit and conceptually the call really belongs more to
dquot_writeback_dquots().

								Honza

Okay, thanks for the feedback!


Regards,
Baokun





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux