Hi David,

On 4/7/2021 7:43 PM, David Orman wrote:
Now that the hybrid allocator appears to be enabled by default in Octopus, is it safe to change bluestore_min_alloc_size_hdd from 64k to 4k on Octopus 15.2.10 clusters, and then redeploy every OSD to switch to the smaller allocation size, without a massive performance impact for RBD? We're seeing a lot of storage usage amplification on HDD-backed EC 8+3 clusters, which lines up with many of the mailing list posts we've seen here. Upgrading to Pacific before making this change is also a possibility once a more stable release arrives, if that's necessary.
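Just to put rough numbers on the amplification you're describing (an illustrative worst case, assuming the default 4K EC stripe unit and small, uncoalesced writes): a full 8+3 stripe carries 8 x 4K = 32K of user data plus 3 x 4K of parity, and each of the 11 chunks lands on its own OSD and gets rounded up to min_alloc_size. With 64K that's 11 x 64K = 704K of raw space for 32K of data; with 4K it would be 11 x 4K = 44K, i.e. just the expected 11/8 EC overhead. Large, well-aligned writes suffer much less since their chunks are already multiples of 64K.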
I wouldn't recommend switching to a 4K min alloc size for pre-Pacific clusters. Additional fixes besides the hybrid allocator are required to avoid performance degradation.
And we decided not to backport those changes to Octopus, as they looked too complicated.
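For reference, when you do make the change on Pacific, the sequence is roughly the following (just a sketch - please double-check against the docs for your release and deployment tooling):

  ceph config set osd bluestore_min_alloc_size_hdd 4096

and then redeploy/recreate each OSD, since min_alloc_size is only applied at OSD creation (mkfs) time - restarting an existing OSD won't change it. To confirm what a given daemon picked up from the config you can query the admin socket on the OSD's host:

  ceph daemon osd.<id> config get bluestore_min_alloc_size_hdd

Keep in mind this shows the configured option, not necessarily the value the OSD's BlueStore was actually created with.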
Second part of this question - we are using RBDs currently on the clusters impacted. These have XFS filesystems on top, which detect the sector size of the RBD as 512 byte, and XFS has a block size of 4k. With the default of 64k for bluestore_min_alloc_size_hdd, let's say a 1G file is written out to the XFS filesystem backed by the RBD. On the ceph side, is this being seen as a lot of 4k objects, so that significant space waste is occurring, or is RBD able to coalesce these into 64k objects, even though XFS is using a 4k block size?

XFS details below; you can see the allocation groups are quite large:

meta-data=/dev/rbd0              isize=512    agcount=501, agsize=268435440 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=134217728000, imaxpct=1
         =                       sunit=16     swidth=16 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

I'm curious if people have been tuning XFS on RBD for better performance, as well.
I presume the actual write block sizes are determined primarily by the application - e.g. whether buffered or direct I/O is in use and how often flush/sync calls are made.
Speculating rather than knowing for sure, though...
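If you want to check empirically whether the writes reaching BlueStore are being coalesced or arriving as small (sub-min_alloc) writes, the small/big write perf counters on an OSD give a decent picture (counter names from memory, they may differ slightly between releases):

  ceph daemon osd.<id> perf dump | grep -E 'bluestore_write_(small|big)'

Roughly speaking, bluestore_write_big counts writes covering whole allocation units, while bluestore_write_small counts the sub-min_alloc ones that go through the read-modify-write / deferred paths.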
Thank you!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx