On Wed 20-11-24 16:57:30, Amir Goldstein wrote: > On Wed, Nov 20, 2024 at 4:23 PM Jan Kara <jack@xxxxxxx> wrote: > > > > On Fri 15-11-24 10:30:22, Josef Bacik wrote: > > > From: Amir Goldstein <amir73il@xxxxxxxxx> > > > > > > Generate FS_PRE_ACCESS event before truncate, without sb_writers held. > > > > > > Move the security hooks also before sb_start_write() to conform with > > > other security hooks (e.g. in write, fallocate). > > > > > > The event will have a range info of the page surrounding the new size > > > to provide an opportunity to fill the conetnt at the end of file before > > > truncating to non-page aligned size. > > > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > > > I was thinking about this. One small issue is that similarly as the > > filesystems may do RMW of tail page during truncate, they will do RMW of > > head & tail pages on hole punch or zero range so we should have some > > strategically sprinkled fsnotify_truncate_perm() calls there as well. > > That's easy enough to fix. > > fallocate already has fsnotify_file_area_perm() hook. > What is missing? Sorry, I've missed that in the patch that was adding it. > > But there's another problem which I'm more worried about: If we have > > a file 64k large, user punches 12k..20k and then does read for 0..64k, then > > how does HSM daemon in userspace know what data to fill in? When we'll have > > modify pre-content event, daemon can watch it and since punch will send modify > > for 12k-20k, the daemon knows the local (empty) page cache is the source of > > truth. But without modify event this is just a recipe for data corruption > > AFAICT. > > > > So it seems the current setting with access pre-content event has only chance > > to work reliably in read-only mode? So we should probably refuse writeable > > open if file is being watched for pre-content events and similarly refuse > > truncate? > > I am confused. not sure I understand the problem. > > In the events that you specific, punch hole WILL generate a FS_PRE_ACCESS > event for 12k-20k. > > When HSM gets a FS_PRE_ACCESS event for 12k-20k it MUST fill the content > and keep to itself that 12k-20k is the source of truth from now on. Ah, right. I've got confused and didn't realize we'll be sending FS_PRE_ACCESS for 12k-20k. Thanks for clarification! > The extra FS_PRE_ACCESS event on 0..64k absolutely does not change that. > IOW, a FS_PRE_ACCESS event on 0..64k definitely does NOT mean that > HSM NEEDS to fill content in 0..64k, it just means that it MAY needs > to fill content > if it hasn't done that for a range before the event. > > To reiterate this important point, it is HSM responsibility to maintain the > "content filled map" per file is its own way, under no circumstances it is > assumed that fiemap or page cache state has anything to do with the > "content filled map". > > The *only* thing that HSM can assume if that if its "content filled map" > is empty for some range, then page cache is NOT yet populated in that > range and that also relies on how HSM and mount are being initialized > and exposed to users. OK, understood and makes sense. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR