On Tue 02-11-21 10:36:42, Joseph Qi wrote: > On 11/1/21 7:31 PM, Jan Kara wrote: > > On Thu 28-10-21 15:09:08, Joseph Qi wrote: > >> Hi Jan, > >> > >> On 10/25/21 11:13 PM, Jan Kara wrote: > >>> ocfs2_truncate_file() did unmap invalidate page cache pages before > >>> zeroing partial tail cluster and setting i_size. Thus some pages could > >>> be left (and likely have left if the cluster zeroing happened) in the > >>> page cache beyond i_size after truncate finished letting user possibly > >>> see stale data once the file was extended again. Also the tail cluster > >> > >> I don't quite understand the case. > >> truncate_inode_pages() will truncate pages from new_i_size to i_size, > >> and the following ocfs2_orphan_for_truncate() will zero range and then > >> update i_size for inode as well as dinode. > >> So once truncate finished, how stale data exposing happens? Or do you > >> mean a race case between the above two steps? > > > > Sorry, I was not quite accurate in the above paragraph. There are several > > ways how stale pages in the pagecache can cause problems. > > > > 1) Because i_size is reduced after truncating page cache, page fault can > > happen after truncating page cache and zeroing pages but before reducing i_size. > > This will in allow user to arbitrarily modify pages that are used for > > writing zeroes into the cluster tail and after file extension these data > > will become visible. > > > > 2) The tail cluster zeroing in ocfs2_orphan_for_truncate() can actually try > > to write zeroed pages above i_size (e.g. if we have 4k blocksize, 64k > > clustersize, and do truncate(f, 4k) on a 4k file). This will cause exactly > > same problems as already described in commit 5314454ea3f "ocfs2: fix data > > corruption after conversion from inline format". > > > > Hope it is clearer now. > > > So the core reason is ocfs2_zero_range_for_truncate() grabs pages and > then zero, right? Well, that is the part that makes things easy to reproduce. > I think an alternative way is using zeroout instead of zero pages, which > won't grab pages again. That would certainly reduce the likelyhood of problems but it is always problematic to first truncate page cache and only then update i_size. For OCFS2 racing page faults can recreate pages in the page cache before i_size is reduced and thus cause "interesting" problems. > Anyway, I'm also fine with your way since it is simple. > > Reviewed-by: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx> Thanks! Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR