On Wed, Sep 18, 2024 at 08:59:41AM +0100, John Garry wrote:
> On 17/09/2024 23:12, Dave Chinner wrote:
> > On Mon, Sep 16, 2024 at 11:24:56AM +0100, John Garry wrote:
> > > > Hence we'll eventually end
> > > > up with atomic writes needing to be enabled at mkfs time, but force
> > > > align will be an upgradeable feature flag.
> > >
> > > Could atomic writes also be an upgradeable feature? We just need to ensure
> > > that agsize % extsize == 0 for an inode enabled for atomic writes.
> >
> > To turn the superblock feature bit on, we have to check the AGs are
> > correctly aligned to the *underlying hardware*. If they aren't
> > correctly aligned (and there is a good chance they will not be)
> > then we can't enable atomic writes at all. The only way to change
> > this is to physically move AGs around in the block device (i.e. via
> > xfs_expand tool I proposed).
> >
> > i.e. the mkfs dependency on having the AGs aligned to the underlying
> > atomic write capabilities of the block device never goes away, even
> > if we want to make the feature dynamically enabled.
> >
> > IOWs, yes, an existing filesystem -could- be upgradeable, but there
> > is no guarantee that it will be.
> >
> > Quite frankly, we aren't going to see block devices that filesystems
> > already exist on suddenly sprout support for atomic writes mid-life.
>
> I would not be so sure. Some SCSI devices used in production which I know
> implicitly write 32KB atomically. And we would like to use them for atomic
> writes.

Ok, but that's not going to be widespread. Very little storage
hardware out there supports atomic writes - the vast majority of
deployments will be new hardware that will have mkfs run on it.

A better argument for dynamic upgrade is turning on atomic writes
on reflink enabled filesystems once the kernel implementation
has been updated to allow the two features to co-exist.

> 32KB is small and I guess that there is a small chance of
> pre-existing AGs not being 32KB aligned. I would need to check if there is
> even a min alignment for AGs...

There is no default alignment for AGs unless there is a stripe
unit set. Then it will align AGs to the stripe unit. There is also
no guarantee that stripe units are aligned to powers of two or
atomic write units...

> > Hence if mkfs detects atomic write support in the underlying device,
> > it should *always* modify the geometry to be compatible with atomic
> > writes and enable atomic write support.
>
> The current solution is to enable via commandline.

Yes, that's the current proposal. What I'm saying is that this
isn't a future-proof solution, nor how we want this functionality to
work in the future.

We should be looking at the block device capabilities (like we do
for stripe unit, etc) and then *do the right thing automatically*.

If the block device advertises atomic write support, then we should
automatically align the filesystem to atomic write constraints, even
if atomic writes can not be immediately enabled (because reflink).

I'm trying to describe how we want things to work once atomic write
support is ubiquitous. It needs to be simple for users and admins,
and it should work (or be reliably upgradeable) out of the box on
all new hardware that supports this functionality.

> > Yes, that means the "incompat with reflink" issue needs to be fixed
> > before we take atomic writes out of experimental (i.e. we consistently
> > apply the same "full support" criteria we applied to DAX).
>
> In the meantime, if mkfs auto-enables atomic writes (when the HW supports),
> what will it do to reflink feature (in terms of enabling)?

I didn't say we should always "auto-enable atomic writes".

I said if the hardware is atomic write capable, then mkfs should
always *align the filesystem* to atomic write constraints. A kernel
upgrade will eventually allow reflink and atomic writes to co-exist,
but only if the filesystem is correctly aligned to the hardware
constraints for atomic writes. We need to ensure we leave that
upgrade path open....

.... and only once we have full support can we make "mkfs
auto-enable atomic writes".

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
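For illustration only, the AG alignment condition discussed in this thread can be sketched roughly as below. This is not mkfs.xfs or kernel code: the function names are hypothetical, and it assumes AG n starts at byte offset n * agblocks * blocksize from the start of the device, that the device or partition start is itself aligned, and that the atomic write unit (awu_bytes) is already known from the hardware.

```python
def misaligned_ags(agcount, agblocks, blocksize, awu_bytes):
    """Return the AG numbers whose start offset is not a multiple of awu_bytes.

    Hypothetical helper: assumes AG n begins at n * agblocks * blocksize
    bytes and that offset 0 of the filesystem is itself aligned.
    """
    ag_bytes = agblocks * blocksize
    return [agno for agno in range(agcount)
            if (agno * ag_bytes) % awu_bytes != 0]


def could_enable_atomic_writes(agcount, agblocks, blocksize, awu_bytes):
    """True only if every AG start is aligned to the atomic write unit."""
    return not misaligned_ags(agcount, agblocks, blocksize, awu_bytes)


if __name__ == "__main__":
    # 4 KiB blocks, 32 KiB atomic write unit (the SCSI case mentioned above).
    print(misaligned_ags(4, 65536, 4096, 32768))  # AG size is a multiple of 32 KiB
    print(misaligned_ags(4, 65533, 4096, 32768))  # AG size is not
    # AG size aligned to a 48 KiB stripe unit (12 blocks) but not to 32 KiB.
    print(misaligned_ags(4, 65532, 4096, 32768))
```

Under these assumptions, an AG size that is a multiple of a non-power-of-two stripe unit (the 48 KiB case) still leaves some AGs misaligned for 32 KiB atomic writes, which is the situation where an in-place upgrade would not be possible without physically moving AGs.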