On Tue, Apr 09, 2024 at 06:08:54PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> Introduce a variant on XFS_SCRUB_METADATA that allows for a vectored
> mode.  The caller specifies the principal metadata object that they
> want to scrub (allocation group, inode, etc.) once, followed by an
> array of scrub types they want called on that object.  The kernel runs
> the scrub operations and writes the output flags and errno code to the
> corresponding array element.
> 
> A new pseudo scrub type BARRIER is introduced to force the kernel to
> return to userspace if any corruptions have been found when scrubbing
> the previous scrub types in the array.  This enables userspace to
> schedule, for example, the sequence:
> 
> 1. data fork
> 2. barrier
> 3. directory
> 
> If the data fork scrub is clean, then the kernel will perform the
> directory scrub.  If not, the barrier in 2 will exit back to userspace.
> 
> When running fstests in "rebuild all metadata after each test" mode, I
> observed a 10% reduction in runtime due to fewer transitions across
> the system call boundary.

Just curious: what is the benefit over having a scruball $OBJECT
interface where the above order is encoded in the kernel instead of in
the scrub tool?

> +	BUILD_BUG_ON(sizeof(struct xfs_scrub_vec_head) ==
> +		     sizeof(struct xfs_scrub_metadata));
> +	BUILD_BUG_ON(XFS_IOC_SCRUB_METADATA == XFS_IOC_SCRUBV_METADATA);

What is the point of these BUILD_BUG_ONs?

> +	if (copy_from_user(&head, uhead, sizeof(head)))
> +		return -EFAULT;
> +
> +	if (head.svh_reserved)
> +		return -EINVAL;
> +
> +	bytes = sizeof_xfs_scrub_vec(head.svh_nr);
> +	if (bytes > PAGE_SIZE)
> +		return -ENOMEM;
> +	vhead = kvmalloc(bytes, GFP_KERNEL | __GFP_RETRY_MAYFAIL);

Why __GFP_RETRY_MAYFAIL and not just a plain GFP_KERNEL?

> +	if (!vhead)
> +		return -ENOMEM;
> +	memcpy(vhead, &head, sizeof(struct xfs_scrub_vec_head));
> +
> +	if (copy_from_user(&vhead->svh_vecs, &uhead->svh_vecs,
> +			head.svh_nr * sizeof(struct xfs_scrub_vec))) {

This should probably use array_size to better deal with overflows.  And
maybe it should use an indirection for the vecs so that we can simply
do a memdup_user to copy the entire array to kernel space?
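
Something along these lines is roughly what I had in mind, completely
untested, and assuming the vectors become a __u64 user pointer in the
head structure (called svh_vectors here purely for illustration) and
that head has already been copied in:

	struct xfs_scrub_vec	*vecs;
	size_t			vec_bytes;

	/*
	 * array_size() saturates to SIZE_MAX on overflow, so the
	 * PAGE_SIZE check below also rejects an overflowing svh_nr.
	 */
	vec_bytes = array_size(head.svh_nr, sizeof(struct xfs_scrub_vec));
	if (vec_bytes > PAGE_SIZE)
		return -ENOMEM;

	/* memdup_user() allocates and copies the whole array in one go. */
	vecs = memdup_user(u64_to_user_ptr(head.svh_vectors), vec_bytes);
	if (IS_ERR(vecs))
		return PTR_ERR(vecs);

With the array capped at a page worth of vectors, the plain kmalloc
done by memdup_user should also remove the need for kvmalloc here.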