On Mon, Jun 17, 2024 at 08:46:03AM +0200, Christoph Hellwig wrote: > On Mon, Jun 17, 2024 at 03:03:28PM +1000, Dave Chinner wrote: > > > That case should be covered by the XFS_IDIRTY_RELEASE, at least > > > except for O_SYNC workloads. > > > > Ah, so I fixed the problem independently 7 or 8 years later to fix > > Linux NFS server performance issues. Ok, that makes removing the > > flag less bad, but I still don't see the harm in keeping it there > > given that behaviour has existed for the past 20 years.... > > I'm really kinda worried about these unaccounted preallocations lingering > around basically forever. How are they "unaccounted"? They are accounted to the inode, they are visible in statx and so du reports them. Maybe you meant "unreclaimable"? But that's not true, either, because a truncate to the same size or a hole punch from EOF to -1 will remove the post-EOF blocks. But that's what the blockgc ioctls are supposed to be doing for these files, so.... > Note that in current mainline there actually > is a path removing them more or less accidentally when there are > delalloc blocks in a can_free_eofblocks path with force == true, > but that's going away with the next patch. ... fix the blockgc walk to ignore DIFLAG_APPEND when doing it's passes. The files are not marked with DIFLAG_PREALLOC, so blockgc should trim them, just like it does with all other files that have had post-eof prealloc that is currently unused. In short: Don't remove the optimisation that prevents worst case fragmentation in known workloads. Instead, fix the garbage collection to do the right thing when space is low and we are optimising for allocation success rather than optimal file layout. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx