[no subject]

**Date** **Thread**

This does leave a hole in some corner cases.  I'm content to say "don't do that"
if you want to use these hooks.

Optionally we could add a FAN_PRE_MMAP hook in vm_mmap() for the range that is
being mmap'ed to make sure we never miss any events, and then applications can
decide if they want to risk it with the pagefault hooks or enable the mmap hooks
for absolute certainty.

> 
> > > > +		int mask = (vmf->flags & FAULT_FLAG_WRITE) ? MAY_WRITE : MAY_READ;
> > > > +		loff_t pos = vmf->pgoff << PAGE_SHIFT;
> > > > +
> > > > +		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
> > > > +
> > > > +		/*
> > > > +		 * We can only emit the event if we did actually release the
> > > > +		 * mmap lock.
> > > > +		 */
> > > > +		if (fpin) {
> > > > +			error = fsnotify_file_area_perm(fpin, mask, &pos,
> > > > +							PAGE_SIZE);
> > > > +			if (error) {
> > > > +				fput(fpin);
> > > > +				return VM_FAULT_ERROR;
> > > > +			}
> > > > +		}
> > > > +	}
> > > > +
> > > >  	/*
> > > >  	 * Do we have something in the page cache already?
> > > >  	 */
> > > ...
> > > > @@ -3612,6 +3643,13 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
> > > >  	unsigned long rss = 0;
> > > >  	unsigned int nr_pages = 0, mmap_miss = 0, mmap_miss_saved, folio_type;
> > > >  
> > > > +	/*
> > > > +	 * We are under RCU, we can't emit events here, we need to force a
> > > > +	 * normal fault to make sure the events get sent.
> > > > +	 */
> > > > +	if (fsnotify_file_has_content_watches(file))
> > > > +		return ret;
> > > > +
> > > 
> > > I don't think we need to do anything for filemap_map_pages(). The call just
> > > inserts page cache content into page tables and whatever is in the page
> > > cache and has folio_uptodate() set should be already valid file content,
> > > shouldn't it?
> > 
> > I'll make this comment more clear.  filemap_fault() will start readahead,
> > but we'll only emit the event for the page size that we're faulting.  I
> > had looked at putting this at the readahead place and figuring out the
> > readahead size, but literally anything could trigger readahead so it's
> > better to just not allow filemap_map_pages() to happen, otherwise we'll
> > end up with empty pages (if the content hasn't been populated yet) and
> > never emit an event for those ranges.
> 
> This seems like an interesting problem. Even ordinary read(2) will trigger
> readahead and as you say, we would be instantiating folios with wrong
> content (zeros) due to that. It seems as a fragile design to keep such
> folios in the page cache and place checks in all the places that could
> possibly make their content visible to the user. I'd rather make sure that
> if we pull folios into page cache (and set folio_uptodate() bit), their
> content is indeed valid.

The hook exists before we go looking in pagecache, so we're fine with read(2),
the only problem is mmap (AFAICT, I am not very smart after all).

> 
> What we could do is to turn off readahead on the inode if
> fsnotify_file_has_content_watches() is true. Essentially the handler of the
> precontent event can do a much better job of prefilling the page cache with
> whatever content is needed in a range that makes sense. And then we could
> leave filemap_map_pages() intact. What do you think guys?

I had considered this but decided against it because it seemed like a big
hammer, but if you're cool with it then so am I.  Thanks,

Josef