Re: [GIT PULL] Fsnotify changes for 6.12-rc1

On Tue, Sep 24, 2024 at 06:15:51PM -0700, Linus Torvalds wrote:
> On Tue, 24 Sept 2024 at 17:17, Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
> >
> > Just side note: I think `generic_file_vm_ops` already prepares this
> > feature, so generic_file_mmap users also have fault around behaviors.
> 
> Hmm. Maybe. But it doesn't really change the fundamental issue - the
> code in question seems to be just *random*.
> 
> And I mean that in a very real and very immediate sense: the
> fault-around code and filemap_map_pages() only maps in pages that are
> uptodate, so it literally DEPENDS ON TIMING whether some previous IO
> has completed or not, and thus on whether the page fault is handled by
> the fault-around in filemap_map_pages() or by the filemap_fault()
> code.
> 
> In other words - I think this is all completely broken.
> 
> Put another way: explain to me why random IO timing details should
> matter for whether we do __filemap_fsnotify_fault() on a page
> fault or not?

They don't, or at least I don't understand what you're getting at here.

If I have a file with these pre-content watches on, no IO is going to occur on
those pages without calling the fanotify hook first.  We have readahead disabled
in this case for this very reason: we need to make sure that every single page
access is caught by fanotify before we allow the IO to proceed.  Otherwise we
get random garbage (well, really we just get empty pages, because these files
are just truncated to the final size).
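
Just to make that model concrete, the fault path for a pre-content watched file
looks roughly like this (a simplified sketch, not the actual patch;
fsnotify_precontent_event() is a made-up name standing in for the real fanotify
pre-content hook):

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/pagemap.h>

/*
 * Simplified sketch of the fault path for a file with a pre-content
 * watch.  fsnotify_precontent_event() is a made-up name standing in
 * for the real fanotify pre-content hook.
 */
static vm_fault_t precontent_filemap_fault(struct vm_fault *vmf)
{
	struct file *file = vmf->vma->vm_file;
	loff_t pos = (loff_t)vmf->pgoff << PAGE_SHIFT;
	int err;

	/*
	 * Readahead and fault-around are disabled for these files, so
	 * every faulting page comes through here and the HSM gets a
	 * chance to fill in the content before any IO is done.
	 */
	err = fsnotify_precontent_event(file, pos, PAGE_SIZE);
	if (err)
		return vmf_error(err);

	/* Only now do the normal fault work against real content. */
	return filemap_fault(vmf);
}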

Now there is the case where userspace could have just done a normal pwrite() to
fill in the file, so any subsequent faults would find uptodate pages and thus
we'd not get an event.  My original code disabled this behavior, specifically
because I wanted the consistency of "any access == event".  But we decided in
one of our many discussions around this code that it would be a bit too
heavy-handed, and that if there were uptodate pages it would be because the HSM
application had filled them in.
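
In other words, the difference between the two models is basically one check
(again just a sketch, reusing the made-up hook name from above):

/*
 * "any access == event": always call the pre-content hook, even for
 * uptodate pages.
 *
 * Current model: skip the hook for uptodate pages, on the theory that
 * only the HSM (or a plain pwrite()) could have made them uptodate.
 */
static int maybe_send_precontent_event(struct file *file, struct folio *folio,
				       loff_t pos, size_t count)
{
	if (folio_test_uptodate(folio))
		return 0;	/* current model: already filled, no event */

	return fsnotify_precontent_event(file, pos, count);
}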

As Jan pointed out, I'm fine going back to that model, as it does have more
consistent behavior.  But I don't want to go rework this code for a 6th time
just to have you reject it and then take a couple of weeks to notice that it
didn't go in.

I realize that to you "we're already using this" doesn't matter, but it matters
to me.  As a rule we don't like carrying patches that aren't upstream; the only
reason we do it is for cases like this, where we need to see how something works
in production to decide whether it's actually worth chasing and implementing.
In this case the answer is overwhelmingly yes: this feature is extremely useful
and significantly improves the launch times of our containers and our
applications.  So I'd really like some concrete guidance on how exactly you want
this to look before I go back to rework it again.  Thanks,

Josef



