Hi John, Thanks for the patch, I've added some review comments and questions below. On Wed, Jan 24, 2024 at 02:26:43PM +0000, John Garry wrote: > Support providing info on atomic write unit min and max for an inode. > > For simplicity, currently we limit the min at the FS block size, but a > lower limit could be supported in future. > > The atomic write unit min and max is limited by the guaranteed extent > alignment for the inode. > > Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx> > --- > fs/xfs/xfs_iops.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ > fs/xfs/xfs_iops.h | 4 ++++ > 2 files changed, 49 insertions(+) > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > index a0d77f5f512e..0890d2f70f4d 100644 > --- a/fs/xfs/xfs_iops.c > +++ b/fs/xfs/xfs_iops.c > @@ -546,6 +546,44 @@ xfs_stat_blksize( > return PAGE_SIZE; > } > > +void xfs_get_atomic_write_attr( > + struct xfs_inode *ip, > + unsigned int *unit_min, > + unsigned int *unit_max) > +{ > + xfs_extlen_t extsz = xfs_get_extsz(ip); > + struct xfs_buftarg *target = xfs_inode_buftarg(ip); > + struct block_device *bdev = target->bt_bdev; > + unsigned int awu_min, awu_max, align; > + struct request_queue *q = bdev->bd_queue; > + struct xfs_mount *mp = ip->i_mount; > + > + /* > + * Convert to multiples of the BLOCKSIZE (as we support a minimum > + * atomic write unit of BLOCKSIZE). > + */ > + awu_min = queue_atomic_write_unit_min_bytes(q); > + awu_max = queue_atomic_write_unit_max_bytes(q); > + > + awu_min &= ~mp->m_blockmask; > + awu_max &= ~mp->m_blockmask; I don't understand why we try to round down the awu_max to blocks size here and not just have an explicit check of (awu_max < blocksize). I think the issue with changing the awu_max is that we are using awu_max to also indirectly reflect the alignment so as to ensure we don't cross atomic boundaries set by the hw (eg we check uint_max % atomic alignment == 0 in scsi). So once we change the awu_max, there's a chance that even if an atomic write aligns to the new awu_max it still doesn't have the right alignment and fails. It works right now since eveything is power of 2 but it should cause issues incase we decide to remove that limitation. Anyways, I think this implicit behavior of things working since eveything is a power of 2 should atleast be documented in a comment, so these things are immediately clear. > + > + align = XFS_FSB_TO_B(mp, extsz); > + > + if (!awu_max || !xfs_inode_atomicwrites(ip) || !align || > + !is_power_of_2(align)) { Correct me if I'm wrong but here as well, the is_power_of_2(align) is esentially checking if the align % uinit_max == 0 (or vice versa if unit_max is greater) so that an allocation of extsize will always align nicely as needed by the device. So maybe we should use the % expression explicitly so that the intention is immediately clear. > + *unit_min = 0; > + *unit_max = 0; > + } else { > + if (awu_min) > + *unit_min = min(awu_min, align); How will the min() here work? If awu_min is the minumum set by the device, how can statx be allowed to advertise something smaller than that? If I understand correctly, right now the way we set awu_min in scsi and nvme, the follwoing should usually be true for a sane device: awu_min <= blocks size of fs <= align so the min() anyways becomes redundant, but if we do assume that there might be some weird devices with awu_min absurdly large (SCSI with high atomic granularity) we still can't actually advertise a min smaller than that of the device, or am I missing something here? > + else > + *unit_min = mp->m_sb.sb_blocksize; > + > + *unit_max = min(awu_max, align); > + } > +} > + Regards, ojaswin