[PATCH] zonefs: Always invalidate last cache page on append write

Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx> · Wed, 29 Mar 2023 14:58:23 +0900

When a direct append write is executed, the append offset may correspond
to the last page of an inode which might have been cached already by
buffered reads, page faults with mmap-read or non-direct readahead.
To ensure that the on-disk and cached data are consistent for this last
cached page, always invalidate it in zonefs_file_dio_append(). This
invalidation is always a no-op when the device block size is equal to
the page size (e.g. 4K).

Reported-by: Hans Holmberg <Hans.Holmberg@xxxxxxx>
Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx>
---
 fs/zonefs/file.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
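
The page index arithmetic behind the commit message can be illustrated with
a small userspace sketch. This is not kernel code: the sketch_* names and
the 4 KiB page size are assumptions made for illustration only. It shows
how ki_pos >> PAGE_SHIFT selects the page holding the append offset, and
why a page-aligned append offset (the case when the block size equals the
page size) does not share a page with data written earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages, i.e. a PAGE_SHIFT of 12. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Index of the page containing byte offset pos (pos >> PAGE_SHIFT). */
static uint64_t sketch_page_index(uint64_t pos)
{
	return pos >> SKETCH_PAGE_SHIFT;
}

/*
 * True when an append at pos starts in the middle of a page. That page
 * also holds previously written data, so a cached copy of it may exist
 * and would go stale once the append completes.
 */
static bool sketch_append_shares_page(uint64_t pos)
{
	return (pos & (SKETCH_PAGE_SIZE - 1)) != 0;
}
```

With 512-byte blocks, an append at offset 512 falls in page 0 together
with the first block, so that page is a candidate for invalidation. With
4096-byte blocks, the append offset is always a multiple of the page size
and starts a new page, which should not have been cached by earlier reads.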

diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 617e4f9db42e..eeab8b93493b 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -390,6 +390,18 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
 	max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize);
 	iov_iter_truncate(from, max);
 
+	/*
+	 * If the inode block size (sector size) is smaller than the
+	 * page size, we may be appending data belonging to an already
+	 * cached last page of the inode. So make sure to invalidate that
+	 * last cached page. This will always be a no-op for the case where
+	 * the block size is equal to the page size.
+	 */
+	ret = invalidate_inode_pages2_range(inode->i_mapping,
+					    iocb->ki_pos >> PAGE_SHIFT, -1);
+	if (ret)
+		return ret;
+
 	nr_pages = iov_iter_npages(from, BIO_MAX_VECS);
 	if (!nr_pages)
 		return 0;
-- 
2.39.2