Re: [PATCH v9 0/9] Implement a batched fsync option for core.fsyncObjectFiles

On Wed, Mar 9, 2022 at 3:10 PM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>
> Replying to an old-ish E-Mail of mine with some more thoughts that came
> to mind after [1] (another recently resurrected fsync() thread).
>
> I wonder if there's another twist on the plan outlined in [2] that would
> be both portable & efficient, i.e. the "slow" POSIX way to write files
> A..Z is to open/write/close/fsync each one, so we'll trigger a HW flush
> N times.
>
> And as we've discussed, doing it just on Z will implicitly flush A..Y on
> common OS's in the wild, which we're taking advantage of here.
>
> But aside from the rename() dance in [2], what do those OS's do if you
> write A..Z, fsync() the "fd" for Z, and then fsync A..Y (or, presumably
> equivalently, in reverse order: Y..A)?
>
> I'd think they'd be smart enough to know that they already implicitly
> flushed that data since Z was flushed, and make those fsync()'s a
> rather cheap noop.
>
> But I don't know, hence the question.
>
> If that's true then perhaps it's a path towards having our cake and
> eating it too in some cases?
>
> I.e. an FS that would flush A..Y if we flush Z would do so quickly and
> reliably, whereas a FS that doesn't have such an optimization might be
> just as slow for all of A..Y, but at least it'll be safe.
>
> 1. https://lore.kernel.org/git/220309.867d93lztw.gmgdl@xxxxxxxxxxxxxxxxxxx/
> 2. https://lore.kernel.org/git/e1747ce00af7ab3170a69955b07d995d5321d6f3.1637020263.git.gitgitgadget@xxxxxxxxx/

The important angle here is that we need some way to tell the OS what
A..Y is before we fsync Z.  I.e. the OS will cache any writes in memory
until some sync-ish operation is done on *that specific file*.  Syncing
just 'Z', with no sync operation on A..Y, gives no indication that A..Y
should be written out.  Apparently the bad old ext3 behavior was similar
to what you're proposing, where a sync on 'Z' would imply something
about independent files.
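
To make the ordering under discussion concrete, here is a minimal
sketch in Python (not git's C code, and not what the patch series
does).  It writes A..Z, fsyncs Z first, and then fsyncs Y..A, so every
file still gets its own sync operation.  The function name and the
file names are made up for illustration.  Whether the later fsync()
calls are actually cheap depends on the filesystem; the sketch does not
measure that.

```python
import os
import tempfile


def write_then_sync_all(directory, files):
    """Write every file, fsync the last one first, then fsync the rest.

    `files` is a list of (name, bytes) pairs.  Hypothetical helper for
    illustration only.  Returns the list of paths written.
    """
    opened = []
    try:
        for name, data in files:
            path = os.path.join(directory, name)
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            opened.append((path, fd))
            # os.write() may write fewer bytes than requested, so loop.
            view = memoryview(data)
            while len(view) > 0:
                written = os.write(fd, view)
                view = view[written:]
        # Sync "Z" first, then Y..A.  Each file is synced explicitly, so
        # nothing relies on the OS flushing unrelated files implicitly.
        for _, fd in reversed(opened):
            os.fsync(fd)
    finally:
        for _, fd in opened:
            os.close(fd)
    return [path for path, _ in opened]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        batch = [("A", b"first"), ("B", b"second"), ("Z", b"last")]
        for p in write_then_sync_all(tmp, batch):
            print(os.path.basename(p), os.path.getsize(p))
```

Note that this only syncs file contents; it does nothing about the
directory entries, which is a separate durability question.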

Here's an interesting paper I recently came across that proposes the
interface we'd really want, 'syncv':
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.924.1168&rep=rep1&type=pdf

Thanks,
Neeraj



