Re: How to reliably measure fs usage with reflinks enabled?

Mike Fleetwood <mike.fleetwood@xxxxxxxxxxxxxx> · Fri, 18 May 2018 15:43:13 +0100

(Sorry for the late reply, work commitments)

On 16 May 2018 at 01:13, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Tue, May 15, 2018 at 02:52:30PM +0100, Mike Fleetwood wrote:
>> On 15 May 2018 at 02:29, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > So the reflink code reserved ~7GB of space in the filesystem (less
>> > than 1%) for its own reflink related metadata if it ever needs it.
>> > It hasn't used it yet but we need to make sure that it's available
>> > when the filesystem is near ENOSPC. Hence it's considered used space
>> > because users cannot store user data in that space.
>> >
>> > The change I plan to make is to reduce the user reported filesystem
>> > size rather than account for it as used space. IOWs, you'd see a
>> > filesystem size of 889G instead of 896G, but have only 8.8GB used.
>> > It means exactly the same thing and will behave exactly the same
>> > way, it's just a different space accounting technique....
>>
>> I'm one of the authors of GParted and it uses the reported file system
>> size [1] and compares it to the block device size to see if the file
>> system fills the partition or not and whether to show unallocated space
>> to the user and advise them to grow the file system to fill the block
>> device [2].  As such we prefer that the reported size of the file system
>> match the highest offset that the file system can write to in the block
>> device.
>
> I think that's a narrow, use case specific assumption. There is
> absolutely no guarantee that the filesystem on a device fills the
> entire device or that the filesystem space reported by df/statvfs
> accurately reflects the size of the underlying block device.
>
> Filesystems are moving towards a virtualised world where space usage
> and capacity is kept separate from the capacity of the underlying
> storage provider. That's a solid direction we are moving with xfs:
>
> https://www.spinics.net/lists/linux-xfs/msg12216.html
>
> so we can support subvolumes:
>
> https://www.youtube.com/watch?v=wG8FUvSGROw
>
> via a virtual block address space that remaps the filesystem space
> accounting away from the underlying physical block device:
>
> https://lwn.net/SubscriberLink/753650/32230c15f3453808/
>
> This will completely break any assumption that the filesystem size
> is related to the underlying storage device(s).
>
> GParted deals very firmly with a specific aspect of disk based
> storage - managing partitions on a physical block device.
> Filesystems need to move beyond physical block devices - sanely
> supporting sparse virtual block devices has been on everyone's
> enterprise filesystem wish list for years.

Agreed that GParted is a tool for simple storage setups with current
full fat block devices and file systems.  As such, enterprise users with
multiple levels in their storage stack are not its target audience.

> GParted doesn't have to support these new features - it can simply
> turn them off for filesystems it creates on physical disk
> partitions, but we're doing stuff to support the storage models
> needed for container hosting, virtualisation, efficient backups and
> cloning, etc. If that means we have to break assumptions that legacy
> infrastructure makes to support those new features, then so be it....
>
> <snip>
>
>> [2] For full disclosure, because tools for various FSs under report
>>     their file system size, there is a heuristic that there must be at
>>     least 2% difference before unallocated space and grow file system
>>     recommendation is generated so under reporting the FS size by less
>>     than 1% wouldn't actually be an issue for us.
>
> So, an ext3 example on a small root filesystem:
>
> $ grep sda1 /proc/partitions
>    8        1    9984366 sda1
> $ df -k /
> Filesystem     1K-blocks    Used Available Use% Mounted on
> /dev/root        9696448 8615892    581340  94% /
> $
>
> Just under 3% difference between fs reported size and the block
> device size, and obviously GParted has been fine with this sort of
> discrepancy on ext3 for the past 15+ years. IIRC the XFS metadata
> reservations max out at around 3% of total filesystem space, so
> GParted should be just fine with us hiding them by reducing total
> filesystem size...

(I assume you are aware, but for completeness ...)
By default ext2/4 kernel code subtracts some overhead blocks from the
statvfs reported f_blocks figure.  This is documented in mount(8)
against the bsddf/minixdf options.

So after checking, GParted was modified to use the dumpe2fs command to
read the superblock to get the file system size for mounted ext* file
systems too.

https://marc.info/?l=linux-ext4&m=134706477618732&w=2
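
As an illustration only (this is not GParted's actual code), the file
system size can be derived from the "Block count:" and "Block size:"
lines that dumpe2fs -h prints from the superblock.  A minimal Python
sketch, where the function name and the sample header text are made up
for the example:

```python
def ext_fs_size_bytes(dumpe2fs_header: str) -> int:
    """Return block count * block size parsed from `dumpe2fs -h` style text."""
    fields = {}
    for line in dumpe2fs_header.splitlines():
        # Split on the first colon only, so values containing colons
        # (for example timestamps) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return int(fields["Block count"]) * int(fields["Block size"])


# Hypothetical sample text; real output has many more fields.
SAMPLE_HEADER = """\
Filesystem volume name:   <none>
Block count:              2496091
Block size:               4096
"""

if __name__ == "__main__":
    print(ext_fs_size_bytes(SAMPLE_HEADER))
```

In practice the text would come from running dumpe2fs -h against the
device, which normally needs root.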

I see that xfs_db doesn't allow reading the super block of mounted XFS
file systems.  So for the case of a mounted XFS on full fat block device
I guess I'll wait and see how much overhead is subtracted from the
statvfs f_blocks figure and make sure GParted accounts for that.
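
To make the comparison concrete, here is a rough Python sketch of the
kind of check involved.  It is an assumption about the shape of the
logic, not GParted's code: the function names are hypothetical, and
measuring the 2% against the block device size is my guess at how the
threshold would be applied.

```python
import os


def reported_fs_bytes(mount_point: str) -> int:
    """File system size as reported by statvfs (f_blocks is in f_frsize units)."""
    st = os.statvfs(mount_point)
    return st.f_blocks * st.f_frsize


def looks_unallocated(fs_bytes: int, device_bytes: int,
                      threshold: float = 0.02) -> bool:
    """True when the FS is smaller than the device by at least `threshold`."""
    if device_bytes <= 0:
        raise ValueError("device size must be positive")
    return (device_bytes - fs_bytes) / device_bytes >= threshold


if __name__ == "__main__":
    # Figures from the ext3 example quoted above, in KiB.
    device_kib = 9984366
    fs_kib = 9696448
    shortfall = (device_kib - fs_kib) / device_kib
    print(f"shortfall: {shortfall:.2%}")
    print("flag unallocated space:", looks_unallocated(fs_kib, device_kib))
```

With the quoted ext3 figures the shortfall is about 2.9%, which is over
a 2% threshold, so a statvfs-only check would flag that file system.
Reporting 889G instead of 896G is a shortfall of under 1% and would not
be flagged.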

>> Just providing an app author's point of view.
>
> *nod*.
>
> We're aware that we need to let existing apps continue to work on
> existing formats and features. But we need to break from the old
> ways to do what people are asking us to do, so we're not going to
> lock ourselves in. If we're not breaking old things and making
> people unhappy, then we're not making sufficient progress.

Mike