Re: [PATCH v3 4/7] xfs: don't bump the i_version on an atime update in xfs_vn_update_time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2022-08-30 at 10:08 +1000, Dave Chinner wrote:
> On Mon, Aug 29, 2022 at 06:33:48AM -0400, Jeff Layton wrote:
> > On Mon, 2022-08-29 at 15:48 +1000, Dave Chinner wrote:
> > > > 
> > > > The race window ought to be relatively small, and this wouldn't result
> > > > in incorrect behavior that you'd notice (other than loss of
> > > > performance), but it's not ideal. We're doing more on-the-wire reads
> > > > than are necessary in this case.
> > > > 
> > > > It would be nice to have it not do that. If we end up taking this patch
> > > > to make it elide the i_version bumps on atime updates, we may be able to
> > > > set the the NOIVER flag in other cases as well, and avoid some of these
> > > > extra bumps.
> > > 
> > > 
> > > <sigh>
> > > 
> > > Please don't make me repeat myself for the third time.
> > > 
> > > Once we have decided on a solid, unchanging definition for the
> > > *statx user API variable*, we'll implement a new on-disk field that
> > > provides this information.  We will document it in the on-disk
> > > specification as "this is how di_iversion behaves" so that it is
> > > clear to everyone parsing the on-disk format or writing their own
> > > XFS driver how to implement it and when to expect it to
> > > change.
> > > 
> > > Then we can add a filesystem and inode feature flags that say "inode
> > > has new iversion" and we use that to populate the kernel iversion
> > > instead of di_changecount. We keep di_changecount exactly the way it
> > > is now for the applications and use cases we already have for that
> > > specific behaviour. If the kernel and/or filesystem don't support
> > > the new di_iversion field, then we'll use di_changecount as it
> > > currently exists for the kernel iversion code.
> > > 
> > 
> > Aside from NFS and IMA, what applications are dependent on the current
> > definition and how do they rely on i_version today?
> 
> I've answered this multiple times already: the di_changecount
> behaviour is defined in the on-disk specification and hence we
> *cannot change the behaviour* without changing the on-disk format
> specification.
> 
> Apart from the forensics aspect of the change counter (which nobody
> but us XFS developers seem to understand just how damn important
> this is), there are *many* third party applications that parse the
> XFS on-disk format directly. This:
> 
> https://codesearch.debian.net/search?q=XFS_SB_VERSION_DIRV2&literal=1
> 
> Shows grub2, libparted, syslinux, partclone and fsarchiver as
> knowing about XFS on-disk superblock flags that tell them what
> format the directory structure is in. That alone is enough to
> indicate they parse on-disk inodes directly, and hence may expect
> di_changecount to have specific meaning and use it to detect
> unexpected changes to files/directories they care about.
> 
> If I go looking for XFS_SB_MAGIC, I find things like libblkid,
> klibc, qemu, Xen, testdisk, gpart, and virtualbox all parse the
> on-disk superblocks directly from the block device, too. They also
> rely directly on XFS developers ensuring there are no silent
> incomaptible changes to the on disk format.
> 
> I also know of many other utilities that people and companies have
> written that parse the on disk format directly from userspace. The
> functions these perform include low level storage management tools,
> copying and managing disk images (e.g. offline configuration for
> cluster deployments), data recovery tools that scrape all the data
> out of broken filesystems, etc.
> 
> These applications are reliant on the guarantee we provide that the
> on-disk format will not silently change and that behaviour/structure
> can always easily be discovered by feature flags in the superblock
> and/or inodes.
> 
> IOWs, just because there aren't obvious "traditional" application on
> top of the kernel filesystem that consumes the in-memory kernel
> iversion field, it does not mean that the defined behaviour of the
> on-disk di_changecount field is not used or relied on by other tools
> that work directly on the on-disk format.
> 
> You might be right that NFS doesn't care about this, but the point
> remains that NFS does not control the XFS on-disk format, nor does
> the fact that what NFS wants from the change attribute has changed
> over time override the fact that maintaining XFS on-disk format
> compatibility is the responsibility of XFS developers. We're willing
> to change the on-disk format to support whatever the new definition
> of the statx change attribute ends up being, and that should be the
> end of the discussion.
> 

Thanks for spelling this out in more detail.

> > > Keep in mind that we've been doing dynamic inode format updates in
> > > XFS for a couple of decades - users don't even have to be aware that
> > > they need to perform format upgrades because often they just happen
> > > whenever an inode is accessed. IOWs, just because we have to change
> > > the on-disk format to support this new iversion definition, it
> > > doesn't mean users have to reformat filesystems before the new
> > > feature can be used.
> > > 
> > > Hence, over time, as distros update kernels, the XFS iversion
> > > behaviour will change automagically as we update inodes in existing
> > > filesystems as they are accessed to add and then use the new
> > > di_iversion field for the VFS change attribute field instead of the
> > > di_changecount field...
> > > 
> > 
> > If you want to create a whole new on-disk field for this, then that's
> > your prerogative, but before you do that, I'd like to better understand
> > why and how the constraints on this field changed.
> > 
> > The original log message from the commit that added a change counter
> > (below) stated that you were adding it for network filesystems like NFS.
> > When did this change and why?
> 
> It never changed. I'll repeat what I've already explained twice
> before:
> 
> https://lore.kernel.org/linux-xfs/20220818030048.GE3600936@xxxxxxxxxxxxxxxxxxx/
> https://lore.kernel.org/linux-xfs/20220818033731.GF3600936@xxxxxxxxxxxxxxxxxxx/
> 
> tl; dr: NFS requirements were just one of *many* we had at the time
> for an atomic persistent change counter.
> 
> The fact is that NFS users are just going to have to put up with
> random cache invalidations on XFS for a while longer. Nobody noticed
> this and/or cared about this enough to raise it as an issue for the
> past decade, so waiting another few months for upstream XFS to
> change to a different on-disk format for the NFS/statx change
> attribute isn't a big deal.
> 

Fair enough. I'll plan to drop this patch from the series for now, with
the expectation that you guys will add a new i_version counter that
better conforms to what NFS and IMA need.

Thanks,
-- 
Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux