On Wed, Oct 04, 2023 at 10:34:13AM -0700, Bart Van Assche wrote:
> On 10/4/23 02:14, John Garry wrote:
> > On 03/10/2023 17:45, Bart Van Assche wrote:
> > > On 10/3/23 01:37, John Garry wrote:
> > > > I don't think that is_power_of_2(write length) is specific to XFS.
> > >
> > > I think this is specific to XFS. Can you show me the F2FS code that
> > > restricts the length of an atomic write to a power of two? I haven't
> > > found it. The only power-of-two check that I found in F2FS is the
> > > following (maybe I overlooked something):
> > >
> > > $ git grep -nH is_power fs/f2fs
> > > fs/f2fs/super.c:3914: if (!is_power_of_2(zone_sectors)) {
> >
> > Any use case we know of requires a power-of-2 block size.
> >
> > Do you know of a requirement for other sizes? Or are you concerned
> > that it is unnecessarily restrictive?
> >
> > We have to deal with HW features like the atomic write boundary and
> > FS restrictions like extent and stripe alignment transparently, and
> > these are almost always powers-of-2, so naturally we would want to
> > work with powers-of-2 for atomic write sizes.
> >
> > The power-of-2 stuff could be dropped if that is what people want.
> > However, we still want to provide a set of rules to the user that
> > makes those HW and FS features transparent to the user.
>
> Hi John,
>
> My concern is that the power-of-2 requirements are only needed for
> traditional filesystems and not for log-structured filesystems (BTRFS,
> F2FS, BCACHEFS).

Filesystems that support copy-on-write data (needed for arbitrary
filesystem block aligned RWF_ATOMIC support) are not necessarily log
structured. For example: XFS.

All three of the filesystems you list above still use power-of-2 block
sizes for most of their metadata structures and for large data extents.
Hence once you go above a certain file size they are going to be doing
full power-of-2 block size aligned IO anyway.
Hence the constraint that atomic writes need to be power-of-2 block size
aligned to avoid RMW cycles doesn't really change for these filesystems.
In which case, they can just set their minimum atomic IO size to be the
same as their block size (e.g. 4kB) and set the maximum to something
they can guarantee gets COW'd in a single atomic transaction. What the
hardware can do with REQ_ATOMIC IO is completely irrelevant at this
point....

> What I'd like to see is that each filesystem declares its atomic write
> requirements (in struct address_space_operations?) and that
> blkdev_atomic_write_valid() checks the filesystem-specific atomic write
> requirements.

That seems unworkable to me - IO constraints propagate from the bottom
up, not from the top down.

Consider multi-device filesystems (btrfs and XFS), where different
devices might have different atomic write parameters. Which set of bdev
parameters does the filesystem report to the querying bdev? (And doesn't
that question just sound completely wrong?)

It also doesn't work for filesystems that can configure extent
allocation alignment at an individual inode level (like XFS) - what does
the filesystem report to the device when it doesn't know what alignment
constraints individual on-disk inodes might be using?

That's why statx() vectors through the filesystems to allow them to set
their own parameters based on the inode statx() is being called on. If
the filesystem has a native RWF_ATOMIC implementation, it can put its
own parameters in the statx min/max atomic write size fields. If the fs
doesn't have its own native support, but can do physical file offset/LBA
alignment, then it publishes the block device atomic support parameters
or overrides them with its internal allocation alignment constraints.
If the bdev doesn't support REQ_ATOMIC, the filesystem says "atomic
writes are not supported".

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx