This appears to be exactly the issue. After testing with XFS mounted with the option 'allocsize=64k', and also after trying ext4, the extraneous filesystem usage disappears. Removing 'allocsize=64k' brings the additional space usage back. The linked article also refers to /proc/sys/fs/xfs/speculative_prealloc_lifetime, which sets the lifetime of the preallocation; that matches the behavior I was seeing.
Thank you for this most excellent find!
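For anyone who wants to check a brick for the two settings mentioned above, here is a minimal Python sketch. It reads /proc/sys/fs/xfs/speculative_prealloc_lifetime (a value in seconds, per the kernel's XFS documentation) and looks for an allocsize= option in /proc/mounts-style text. The function names and the brick path /data/brick1 are hypothetical and used only for illustration; they do not come from this thread.

```python
LIFETIME_PATH = "/proc/sys/fs/xfs/speculative_prealloc_lifetime"


def read_prealloc_lifetime(path=LIFETIME_PATH):
    """Return the tunable as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File missing (no XFS module loaded, non-Linux host) or unparsable.
        return None


def xfs_allocsize(mounts_text, mountpoint):
    """Return the allocsize value for an XFS mountpoint, or None if unset.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mnt, fstype, options = fields[:4]
        if mnt == mountpoint and fstype == "xfs":
            for opt in options.split(","):
                if opt.startswith("allocsize="):
                    return opt.split("=", 1)[1]
            return None
    return None
```

On a brick host this could be called as xfs_allocsize(open("/proc/mounts").read(), "/data/brick1"). A result of None would suggest the mount is using the default dynamic preallocation rather than a fixed allocsize.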
On Wed, Feb 9, 2022 at 10:01 AM Xavi Hernandez <jahernan@xxxxxxxxxx> wrote:
Hi,
this problem is most likely caused by the XFS speculative preallocation (https://linux-xfs.oss.sgi.narkive.com/jjjfnyI1/faq-xfs-speculative-preallocation)
Regards,
Xavi
On Sat, Feb 5, 2022 at 10:19 AM Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
It seems quite odd.
I'm adding the devel list, as it looks like a bug - but it could be a feature ;)
Best Regards,
Strahil Nikolov
----- Forwarded message -----
From: Fox <foxxz.net@xxxxxxxxx>
To: Gluster Users <gluster-users@xxxxxxxxxxx>
Sent: Saturday, 5 February 2022, 05:39:36 GMT+2
Subject: Re: [Gluster-users] Distributed-Disperse Shard Behavior
I tried setting the shard size to 512MB. It slightly improved the space utilization during creation - not quite double space utilization. And I didn't run out of space creating a file that occupied 6 GB of the 8 GB volume (and I even tried 7168MB just fine). See attached command line log.
On Fri, Feb 4, 2022 at 6:59 PM Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
It sounds like a bug to me.
In virtualization, sharding is quite common (though on replica volumes) and I have never observed such behavior.
Can you increase the shard size to 512M and check if the situation is better?
Also, share the volume info.
Best Regards,
Strahil Nikolov
On Fri, Feb 4, 2022 at 22:32, Fox <foxxz.net@xxxxxxxxx> wrote:
Using Gluster v10.1 and creating a Distributed-Dispersed volume with sharding enabled.
I create a 2 GB file on the volume using the 'dd' tool. The file size shows 2 GB with 'ls'. However, 'df' shows 4 GB of space utilized on the volume. After several minutes the volume utilization drops to the 2 GB I would expect.
This is repeatable for different large file sizes and different disperse/redundancy brick configurations.
I've also encountered a situation, as configured above, where I utilize close to full disk capacity and am momentarily unable to delete the file.
I have attached a command line log of an example of the above, using a set of test VMs set up in a GlusterFS cluster.
Is this initial 2x space utilization anticipated behavior for sharding? It would mean that I can never create a file bigger than half my volume size, as I get an I/O error with no space left on disk.
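The gap between what 'ls' and 'df' report can also be checked per file. Below is a minimal Python sketch that compares a file's apparent size (st_size, what 'ls' shows) with the space allocated to it (st_blocks, counted in 512-byte units on Linux), and computes filesystem usage the way 'df' does. The function names are hypothetical. The sketch assumes it is run against a file on the brick's own filesystem; whether the Gluster mount reports the same numbers for a sharded file is not verified here.

```python
import os


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    st_blocks is counted in 512-byte units on Linux, independent of the
    filesystem block size.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def fs_used_bytes(path):
    """Return the bytes in use on the filesystem that contains path."""
    vfs = os.statvfs(path)
    return (vfs.f_blocks - vfs.f_bfree) * vfs.f_frsize
```

If the allocated size is well above the apparent size right after the write and falls back some minutes later, that would be consistent with preallocated space being released rather than with the data itself being stored twice.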
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel