Re: [fuse-devel] [fuse] interaction between O_APPEND and writeback cache

On Aug 04 2017, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> On Fri, Aug 4, 2017 at 9:10 PM, Nikolaus Rath <Nikolaus@xxxxxxxx> wrote:
>> Hello,
>>
>> I am confused about how O_APPEND is supposed to interact with the
>> writeback cache.
>>
>> As far as I can tell, the O_APPEND flag is currently passed to the
>> filesystem process, so my expectation is that the filesystem process is
>> responsible for ignoring any offset in write requests and instead write
>> at the current end of the file[1].
>>
>> However, with writeback cache enabled the filesystem process cannot tell
>> which data is "new" (came from userspace and should be appended) and
>> which data is old and just made a round trip through the kernel. So it seems
>> to me that the filesystem process should probably leave the handling of
>> O_APPEND to the kernel. But then, shouldn't the kernel filter out this
>> flag when sending the open request?
>
> Indeed, when writing back the cache the kernel should definitely not
> set O_APPEND.

Well, kernel 4.9 certainly does set it, though. Should I try to make a
patch, or are you or Maxim going to do that shortly anyway?

Do you think it makes sense to filter out O_APPEND in libfuse as well
(to work around the issue for present day kernels)?
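
To make the idea concrete, here is a minimal sketch of such a workaround,
assuming the library knows whether writeback caching was negotiated. The
function name filter_open_flags and the writeback_cache parameter are
made up for illustration; this is not existing libfuse code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical helper, not part of libfuse: returns the open flags that
 * would be handed to the filesystem. With writeback caching the kernel
 * chooses the write offsets, so O_APPEND is masked out; without it the
 * flags pass through unchanged. */
int filter_open_flags(int flags, bool writeback_cache)
{
    int out = writeback_cache ? (flags & ~O_APPEND) : flags;

    /* Sanity check: O_APPEND never survives when writeback is enabled. */
    assert(!writeback_cache || (out & O_APPEND) == 0);
    return out;
}
```

With this, a filesystem that honors O_APPEND itself would only ever see
the flag when it is also the one choosing the write offsets.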

>> On the other hand, when the kernel handles O_APPEND, then it is no
>> longer atomic (think of a network fuse filesystem).
>
> Yes, network filesystem generally needs to handle consistency of
> caches across nodes and O_APPEND is no exception (i.e. you cannot have
> two nodes writing O_APPEND to cache at the same time, because that
> will not work).

This poses a bit of a problem though: a network filesystem either
cannot use writeback caching, or O_APPEND will (silently) not work.

With the current behavior (O_APPEND being passed to open() when
writeback is enabled) the filesystem would at least have a chance to
return an error, i.e. instead of a silent failure there would be a noisy
error. With that in mind, maybe the current behavior isn't so bad? We'd
just have to document that if writeback cache is enabled and O_APPEND
is received, the filesystem has to decide if it is fine with the kernel
handling O_APPEND (and in that case ignore the flag for subsequent
writes) or return an error.
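
A rough sketch of that decision, as it might look in a filesystem's open
handler. Everything here is an assumption for illustration: the function
name check_append_open, the shared_writers parameter (standing in for
"other nodes may append to this file"), and the choice of EOPNOTSUPP as
the error code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical check, not a libfuse API. Returns 0 to accept the open,
 * or a negative errno value to reject it. */
int check_append_open(int flags, bool writeback_cache, bool shared_writers)
{
    int rc = 0;

    /* Only the combination O_APPEND + writeback cache needs a decision.
     * If other writers may exist, kernel-side appends are not atomic,
     * so fail loudly instead of silently misplacing data. */
    if ((flags & O_APPEND) && writeback_cache && shared_writers)
        rc = -EOPNOTSUPP; /* assumption: error code picked for illustration */

    /* rc == 0 with O_APPEND and writeback means: accept, let the kernel
     * handle O_APPEND, and ignore the flag on subsequent writes. */
    assert(rc <= 0);
    return rc;
}
```

A single-node filesystem would pass shared_writers = false and always
accept; a network filesystem could pass true and get the noisy error.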


Best,
-Nikolaus


-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«



