Re: [rfc] fsync_range?

Nick Piggin wrote:
> Again it comes back to the whole writeout thing, which makes it more
> constraining on the kernel to optimise.

Cute :-)
It was intended to make it easier to optimise, but maybe it failed.

> For example, my fsync "livelock" avoidance patches did the following:
> 
> 1. find all pages which are dirty or under writeout first.
> 2. write out the dirty pages.
> 3. wait for our set of pages.
> 
> Simple, obvious, and the kernel can optimise this well because the
> userspace has asked for a high level request "make this data safe"
> rather than low level directives. We can't do this same nice simple
> sequence with sync_file_range because SYNC_FILE_RANGE_WAIT_AFTER
> means we have to wait for all writeout pages in the range, including
> unrelated ones, after the dirty writeout. SYNC_FILE_RANGE_WAIT_BEFORE
> means we have to wait for clean writeout pages before we even start
> doing real work.

As noted in my other mail just now, although sync_file_range() is
described as though it does the three bulk operations consecutively, I
don't think it would be too shocking to read the intended semantics as
something that _could_ be:

    "wait and initiate writeouts _as if_ we did, for each page _in parallel_ {
        if (SYNC_FILE_RANGE_WAIT_BEFORE && page->writeout) wait(page)
        if (SYNC_FILE_RANGE_WRITE) start_writeout(page)
        if (SYNC_FILE_RANGE_WAIT_AFTER && page->writeout) wait(page)
     }"

That permits many strategies, and I think one of them is the nice
livelock-avoiding fsync you describe up above.
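
For reference, here is roughly what the three-flag call under discussion
looks like from userspace on Linux with glibc. This is only a sketch: the
helper names write_and_sync_range() and demo_range_sync() are made up for
illustration, and the call does not flush file metadata or the disk's write
cache, so it is not a drop-in fsync() replacement.

```c
#define _GNU_SOURCE   /* must come first: exposes sync_file_range() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes at offset `off`, then ask the
 * kernel to write out and wait for just that byte range.
 * Returns 0 on success, -1 on failure (a short write is also treated as
 * failure here, for simplicity).
 */
int write_and_sync_range(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL);

    ssize_t n = pwrite(fd, buf, len, off);
    if (n < 0 || (size_t)n != len)
        return -1;

    /* All three flags together: wait for old writeout, start new
     * writeout of dirty pages, then wait for it to finish. */
    return sync_file_range(fd, off, (off_t)len,
                           SYNC_FILE_RANGE_WAIT_BEFORE |
                           SYNC_FILE_RANGE_WRITE |
                           SYNC_FILE_RANGE_WAIT_AFTER);
}

/* Hypothetical demo: create a scratch file, write and sync, clean up. */
int demo_range_sync(const char *path)
{
    static const char msg[] = "make this data safe";

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int rc = write_and_sync_range(fd, msg, sizeof msg - 1, 0);

    close(fd);
    unlink(path);
    return rc;
}
```

Calling demo_range_sync() with any writable scratch path should return 0 on
a regular file; the interesting question in this thread is what the kernel
is allowed to do internally for that one call.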

You might be able to squeeze the sync_file_range() flags into that by
chopping it up like this.  Btw, you omitted step 1.5 "wait for dirty
pages which are already under writeout", but it's made explicit here:

    1. find all pages which are dirty or under writeout first,
       and remember which of them are dirty _and_ under writeout (DW).
    2. if (SYNC_FILE_RANGE_WRITE)
           write out the dirty pages not in DW.
    3. if (SYNC_FILE_RANGE_WAIT_BEFORE) {
           wait for the set of pages in DW.
           write out the pages in DW.
       }
    4. if (SYNC_FILE_RANGE_WAIT_BEFORE || SYNC_FILE_RANGE_WAIT_AFTER)
           wait for our set of pages.
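
The four steps above can be tried out in a toy userspace model. This is
not kernel code: struct toy_page, the TOY_* flags, and the helpers are all
invented for illustration, and "waiting" is modelled as the I/O completing
immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_MAX_PAGES = 64 };

/* Toy stand-ins for the sync_file_range() flags. */
#define TOY_WAIT_BEFORE 1u
#define TOY_WRITE       2u
#define TOY_WAIT_AFTER  4u

struct toy_page {
    bool dirty;     /* has data not yet submitted for I/O */
    bool writeout;  /* I/O currently in flight */
};

static void toy_start_writeout(struct toy_page *p)
{
    p->dirty = false;
    p->writeout = true;
}

static void toy_wait(struct toy_page *p)
{
    p->writeout = false;  /* model: the in-flight I/O completes */
}

void toy_sync_range(struct toy_page *pages, size_t n, unsigned flags)
{
    bool in_set[TOY_MAX_PAGES];  /* dirty or under writeout at entry */
    bool in_dw[TOY_MAX_PAGES];   /* dirty _and_ under writeout at entry */

    assert(n <= TOY_MAX_PAGES);

    /* 1. Snapshot the set once; pages dirtied later are ignored. */
    for (size_t i = 0; i < n; i++) {
        in_set[i] = pages[i].dirty || pages[i].writeout;
        in_dw[i]  = pages[i].dirty && pages[i].writeout;
    }

    /* 2. Write out the dirty pages that are not in DW. */
    if (flags & TOY_WRITE)
        for (size_t i = 0; i < n; i++)
            if (in_set[i] && pages[i].dirty && !in_dw[i])
                toy_start_writeout(&pages[i]);

    /* 3. DW pages: wait for the old writeout, then write again. */
    if (flags & TOY_WAIT_BEFORE)
        for (size_t i = 0; i < n; i++)
            if (in_dw[i]) {
                toy_wait(&pages[i]);
                toy_start_writeout(&pages[i]);
            }

    /* 4. Wait for our set of pages only. */
    if (flags & (TOY_WAIT_BEFORE | TOY_WAIT_AFTER))
        for (size_t i = 0; i < n; i++)
            if (in_set[i])
                toy_wait(&pages[i]);
}
```

With all three flags set, every page in the snapshot ends up neither dirty
nor under writeout. With only TOY_WRITE, plain dirty pages go under
writeout and DW pages are left alone, since nothing was allowed to wait
for them.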

However, maybe the flags aren't all that useful really, and maybe
sync_file_range() could be replaced by a stub which ignores the flags
and calls fsync_range().
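
To make the stub idea concrete, here is a userspace sketch. Since
fsync_range() is only a proposal in this thread, toy_fsync_range() below is
a hypothetical stand-in that simply falls back to fsync() on the whole
file; it over-syncs but never under-syncs. All names prefixed toy_ or
demo_ are invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the proposed fsync_range().
 * Assumption: syncing the whole file is an acceptable, conservative
 * substitute for syncing just [offset, offset + length).
 */
int toy_fsync_range(int fd, off_t offset, off_t length)
{
    if (offset < 0 || length < 0) {
        errno = EINVAL;
        return -1;
    }
    return fsync(fd);
}

/* Stub with a sync_file_range()-like signature that ignores the flags. */
int toy_sync_file_range_stub(int fd, off_t offset, off_t nbytes,
                             unsigned int flags)
{
    (void)flags;  /* deliberately unused: every request becomes a full sync */
    return toy_fsync_range(fd, offset, nbytes);
}

/* Hypothetical demo: write to a scratch file and push it through the stub. */
int demo_stub(const char *path)
{
    static const char msg[] = "flags ignored";

    assert(path != NULL);

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int rc = -1;
    if (write(fd, msg, sizeof msg - 1) == (ssize_t)(sizeof msg - 1))
        rc = toy_sync_file_range_stub(fd, 0, (off_t)(sizeof msg - 1), 0u);

    close(fd);
    unlink(path);
    return rc;
}
```

One consequence worth noting: a caller that passes only
SYNC_FILE_RANGE_WRITE to kick off writeout without blocking would now
block under such a stub, so the stub is safe but not free.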

-- Jamie