On Wed, Feb 28, 2024 at 07:44:32AM -0800, Christoph Hellwig wrote: > On Mon, Feb 26, 2024 at 06:21:23PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > Introduce a pair of new ioctls to handle exchanging ranges of bytes > > between files. The goal here is to perform the exchange atomically with > > respect to applications -- either they see the file contents before the > > exchange or they see that A-B is now B-A, even if the kernel crashes. > > > > The simpler of the two ioctls is XFS_IOC_EXCHANGE_RANGE, which performs > > the exchange unconditionally. XFS_IOC_COMMIT_RANGE, on the other hand, > > requires the caller to sample the file attributes of one of the files > > participating in the exchange, and aborts the exchange if that file has > > changed in the meantime (presumably by another thread). > > So per all the discussions, wouldn't it make sense to separate out > XFS_IOC_COMMIT_RANGE (plus the new start commit one later), and if > discussions are still going on just get XFS_IOC_EXCHANGE_RANGE > done ASAP to go on with online repair, and give XFS_IOC_COMMIT_RANGE > enough time to discuss the finer details? Done. All of the COMMIT_RANGE code (and the _CHECK_FRESH2 code) have been moved to a separate patch and combined with the patch[1] from yesterday. [1] https://lore.kernel.org/linux-xfs/CAOQ4uxiPfno-Hx+fH3LEN_4D6HQgyMAySRNCU=O2R_-ksrxSDQ@xxxxxxxxxxxxxx/ > > +struct xfs_exch_range { > > + __s64 file1_fd; > > I should have noticed this last time, by why are we passing a fd > as a 64-bit value when it actually just is a 32-bit value in syscalls? > (same for commit). I'll change that. > > + if (((fxr->file1->f_flags | fxr->file2->f_flags) & (__O_SYNC | O_DSYNC)) || > > Nit: overly long line here. I'll replace the __O_SYNC | O_DSYNC with O_SYNC, since they're the same thing. > > + if (fxr->flags & ~(XFS_EXCHRANGE_ALL_FLAGS | __XFS_EXCHRANGE_CHECK_FRESH2)) > > .. and here Fixed, thanks. > > + /* > > + * The ioctl enforces that src and dest files are on the same mount. > > + * However, they only need to be on the same file system. > > + */ > > + if (inode1->i_sb != inode2->i_sb) > > + return -EXDEV; > > How about only doing this checks once further up? As the same sb also > applies the same mount. I'll remove this check entirely, since we've already checked that the vfsmnt are the same. Assuming that's what you meant-- I was slightly confused by "same sb also applies the same mount" and decided to interpret that as "same sb implies the same mount". --D