Re: [PATCH] xfs: allow changing extsize on file

On Wed, May 08, 2024 at 10:03:43AM -0700, Wengang Wang wrote:
> Hi Dave, this is more a question than a patch.
> 
> We are currently disallowing the change of extsize on files/dirs if the file/dir
> has blocks allocated. That's not that friendly to users. Say somehow the
> extsize was set very huge (1GiB), in the following cases, it's not that

The first problem is ensuring that "say somehow extsize was set very
huge" doesn't happen in the first place. Then all the other problems
just don't happen.

> convenient:
> case 1: the file now extends very little. -- 1GiB extsize leads to a waste of
>         almost 1GiB.
> case 2: when CoW happens, 1GiB is preallocated. 1GiB is now too big for the
>         IO pattern, so the huge preallocating and then reclaiming is not necessary
>         and that costs extra time especially when the system is fragmented.
> 
> In the above cases, changing extsize to a smaller value is needed.
> 
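To put rough numbers on case 1, here is a minimal sketch. It assumes every allocation is rounded up to a whole multiple of the extent size hint, which is a simplification of what the allocator really does. The function name and constants are made up for illustration.

```python
GIB = 1 << 30
KIB = 1 << 10


def allocated_with_hint(file_size: int, extsize: int) -> int:
    """Bytes allocated if allocations are rounded up to a multiple of extsize.

    Simplifying assumption for illustration only; extsize == 0 means "no hint".
    """
    if extsize == 0 or file_size == 0:
        return file_size
    # Ceiling division, then scale back up to bytes.
    return -(-file_size // extsize) * extsize


size = 4 * KIB
waste = allocated_with_hint(size, GIB) - size
print(f"file size {size} bytes, over-allocation {waste} bytes")
```

Under that assumption, a file that only ever holds a few KiB still pins close to a full 1GiB of space, which is the cost being described.
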
> In theory, the exthint is a hint for future allocation,

It's not that simple because future allocation is influenced by past
allocation. e.g. What happens if the new extent size hint is not
aligned with the old one and we now have two different extent
alignments in the file?
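
As a rough illustration of the mixed-alignment problem, the sketch below checks a list of extents against a hint. The (offset, length) tuples in bytes and the helper name are hypothetical, and only file-offset alignment is modelled.

```python
MIB = 1 << 20


def misaligned_extents(extents, extsize):
    """Return the extents whose offset or length is not a multiple of extsize.

    extents: iterable of (file_offset, length) pairs in bytes (hypothetical
    representation, not the on-disk format).
    """
    return [(off, length) for off, length in extents
            if off % extsize or length % extsize]


old_hint = 1 * MIB
new_hint = 768 * 1024  # deliberately not a multiple or divisor of the old hint

# Extents laid out while the old 1MiB hint was in force.
extents = [(0, MIB), (MIB, MIB), (2 * MIB, MIB)]

print(misaligned_extents(extents, old_hint))
print(misaligned_extents(extents, new_hint))
```

Every extent is consistent with the old hint, yet all of them look wrong when judged against the new one, and nothing in the file records which hint produced which extent.
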

What happens if an admin sees this when trying to triage some
other problem and doesn't know that the extent size hint has been
changed? They'll think there is a bug in the filesystem allocator
and report it.

What do we do with that report now? Do we waste hours trying to
reproduce it and fail, maybe never learning that an extent
size hint change caused the issue? i.e. how do we determine that the
issue is a real allocation alignment bug versus it simply being a
result of "application did something whacky with extent size hints"?

Hence allowing extent size hints to change dynamically basically
makes it impossible to trust that the current extent size hint
defines the alignment for all the extents in the file. And at that
point, we completely lose the ability to triage allocation alignment
issues without an exact reproducer from the reporter...

Now, just disabling extent size hints avoids this issue (i.e. allow
return to zero if extents already exist) because there's now no
alignment restriction at all and nobody is going to care. However,
this creates new issues.

e.g. it opens up the possibility that applications will scan existing
files for extent size hints set on them and be able to -override the
admin set alignment hints- used to create the data set.

The admin may have set inheritable extent size hints to ensure
allocation alignment to underlying storage because the applications
don't know about optimal storage alignments (e.g. for PMD alignment
on DAX storage). We don't want applications to be able to disable
these hints because the precise reason they are set is to optimise
storage alignment for better application performance....

IOWs, there are good reasons for not allowing extent size hints to
be overridden by applications just by clearing/changing the inode
extent size field...

> I can't connect it
> to the blocks which are already allocated to the file/dir.
> So the only reason why we disallow that is that there might be some problems if
> we allow it.  Well, can we fix the real problem(s) rather than disallowing
> extsize changing?

The only reliable way to change extent size hints so allocation
alignment always matches the new extent size hint is to physically
realign the data in the file to the new extent size hint. i.e. do it
through xfs_fsr to "defrag" the file according to the new extent
size hint. Then when we swap the old and new data extents, we also
set the new extent size hint that matches the new data extents.
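
The ordering of that approach can be sketched with a toy model. This is not how xfs_fsr is implemented; ToyFile, aligned_layout and realign are hypothetical names, and the only point is that the new layout is built first and the hint changes in the same step as the extent swap.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFile:
    size: int                      # logical file size in bytes
    extsize: int                   # current extent size hint in bytes
    extents: list = field(default_factory=list)  # (offset, length) pairs


def aligned_layout(size, extsize):
    """Extents covering [0, size), each starting at and spanning a multiple
    of extsize (the last one is rounded out past EOF)."""
    return [(off, extsize) for off in range(0, size, extsize)]


def realign(f, new_extsize):
    """Lay the data out under the new hint, then swap extents and hint together."""
    if new_extsize <= 0:
        raise ValueError("toy model only handles a non-zero hint")
    new_extents = aligned_layout(f.size, new_extsize)  # the "defrag" copy
    # Swap: the hint never disagrees with the extents it describes.
    f.extents, f.extsize = new_extents, new_extsize
    return f


f = ToyFile(size=5 << 20, extsize=1 << 30, extents=[(0, 1 << 30)])
realign(f, 1 << 20)
print(f.extsize, f.extents)
```

Because the hint only changes together with the extents, the current hint can still be trusted to describe every extent in the file.
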

This extent size hint change is then enabled through a completely
different interface which is not one applications will use in
general operation. Hence it becomes an explicit admin operation,
enabling users to rectify the rare problems you document above
without compromising the existing behaviour of extent size hints for
everyone else.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



