On 18/07/2024 09:53, John Garry wrote:
On 12/07/2024 00:20, Dave Chinner wrote:
/* Reflink'ed disallowed */
+ if (flags2 & XFS_DIFLAG2_REFLINK)
+ return __this_address;
Hmm. If we don't support reflink + forcealign ATM, then shouldn't the
superblock verifier or xfs_fs_fill_super fail the mount so that old
kernels won't abruptly emit EFSCORRUPTED errors if a future kernel adds
support for forcealign'd cow and starts writing out files with both
iflags set?
I don't think we should error out the mount because reflink and
forcealign are enabled - that's going to be the common configuration
for every user of forcealign, right? I also don't think we should
throw a corruption error if both flags are set, either.
We're making an initial*implementation choice* not to implement the
two features on the same inode at the same time. We are not making a
an on-disk format design decision that says "these two on-disk flags
are incompatible".
IOWs, if both are set on a current kernel, it's not corruption but a
more recent kernel that supports both flags has modified this inode.
Put simply, we have detected a ro-compat situation for this specific
inode.
Looking at it as a ro-compat situation rather then corruption,
what I would suggest we do is this:
1. Warn at mount that reflink+force align inodes will be treated
as ro-compat inodes. i.e. read-only.
I am looking at something like this to implement read-only for those inodes:
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 07f736c42460..444a44ccc11c 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1132,6 +1132,17 @@ xfs_vn_tmpfile(
return finish_open_simple(file, err);
}
+static int xfs_permission(struct mnt_idmap *d, struct inode *inode, int
mask)
+{
+ struct xfs_inode *ip = XFS_I(inode);
+
+ if (mask & MAY_WRITE) {
+ if (xfs_is_reflink_inode(ip) && xfs_inode_has_forcealign(ip))
+ return -EACCES;
+ }
+ return generic_permission(d, inode, mask);
+}
+
static const struct inode_operations xfs_inode_operations = {
.get_inode_acl = xfs_get_acl,
.set_acl = xfs_set_acl,
@@ -1142,6 +1153,7 @@ static const struct inode_operations
xfs_inode_operations = {
.update_time = xfs_vn_update_time,
.fileattr_get = xfs_fileattr_get,
.fileattr_set = xfs_fileattr_set,
+ .permission = xfs_permission,
};
Or how else could this be done? I guess that we have something else in
the xfs code to implement the equivalent of this, but I did not find it.
2. prevent forcealign from being set if the shared extent flag is
set on the inode.
This is just XFS_DIFLAG2_REFLINK flag, right?
3. prevent shared extents from being created if the force align flag
is set (i.e. ->remap_file_range() and anything else that relies on
shared extents will fail on forcealign inodes).
In this series version I extend the RT check in xfs_reflink_remap_prep()
to cover forcealign - is that good enough?
4. if we read an inode with both set, we emit a warning and force
the inode to be read only so we don't screw up the force alignment
of the file (i.e. that inode operates in ro-compat mode.)
#1 is the mount time warning of potential ro-compat behaviour.
#2 and #3 prevent both from getting set on existing kernels.
#4 is the ro-compat behaviour that would occur from taking a
filesystem that ran on a newer kernel that supports force-align+COW.
This avoids corruption shutdowns and modifications that would screw
up the alignment of the shared and COW'd extents.
This seems fine for dealing with forcealign and reflink.
So what about forcealign and RT?
Any opinion on this?
We want to support this config in future, but the current implementation
will not support it.
In this v2 series, I just disallow a mount for forcealign and RT,
similar to reflink and RT together.
Furthermore, I am also saying here that still forcealign and RT bits set
is a valid inode on-disk format and we just have to enforce a
sb_rextsize to extsize relationship:
xfs_inode_validate_forcealign(
struct xfs_mount *mp,
uint32_t extsize,
uint32_t cowextsize,
uint16_t mode,
uint16_t flags,
uint64_t flags2)
{
bool rt = flags & XFS_DIFLAG_REALTIME;
...
/* extsize must be a multiple of sb_rextsize for RT */
if (rt && mp->m_sb.sb_rextsize && extsize % mp->m_sb.sb_rextsize)
return __this_address;
And this? If not, I'll just send v3 with this code as-is.
return NULL;
}