Hey Wladimir,

I actually don't know where this is referenced in the docs, if anywhere. Googling around shows many people discovering this overhead the hard way on ceph-users.

I also don't know the rbd journaling mechanism in enough depth to comment on whether it could be causing this issue for you. Are you seeing a high allocated:stored ratio on your cluster?

Josh

On Sun, Jul 4, 2021 at 6:52 AM Wladimir Mutel <mwg@xxxxxxxxx> wrote:
> Dear Mr Baergen,
>
> Thanks a lot for your very concise explanation. However, I would like to
> learn more about why the default BlueStore allocation size causes such a
> big storage overhead, and where in the Ceph docs it is explained what to
> watch for to avoid hitting this phenomenon again and again.
> I have a feeling this is what I get on my experimental Ceph setup with
> the simplest JErasure 2+1 data pool.
> Could it be caused by journaled RBD writes to an EC data pool?
>
> Josh Baergen wrote:
> > Hey Arkadiy,
> >
> > If the OSDs are on HDDs and were created with the default
> > bluestore_min_alloc_size_hdd, which is still 64KiB in Octopus, then in
> > effect data will be allocated from the pool in 640KiB chunks (64KiB *
> > (k+m)). 5.36M objects taking up 501GiB is an average object size of
> > 98KiB, which gives an allocated:stored ratio of 6.53:1, pretty close
> > to the 7:1 observed.
> >
> > If my assumption about your configuration is correct, then the only
> > way to fix this is to adjust bluestore_min_alloc_size_hdd and recreate
> > all your OSDs, which will take a while...
> >
> > Josh
> >
> > On Tue, Jun 29, 2021 at 3:07 PM Arkadiy Kulev <eth@xxxxxxxxxxxx> wrote:
> >
> >> The pool *default.rgw.buckets.data* has *501 GiB* stored, but USED
> >> shows *3.5 TiB* (7 times higher!):
> >>
> >> root@ceph-01:~# ceph df
> >> --- RAW STORAGE ---
> >> CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
> >> hdd    196 TiB  193 TiB  3.5 TiB  3.6 TiB        1.85
> >> TOTAL  196 TiB  193 TiB  3.5 TiB  3.6 TiB        1.85
> >>
> >> --- POOLS ---
> >> POOL                       ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
> >> device_health_metrics       1    1   19 KiB       12   56 KiB      0     61 TiB
> >> .rgw.root                   2   32  2.6 KiB        6  1.1 MiB      0     61 TiB
> >> default.rgw.log             3   32  168 KiB      210   13 MiB      0     61 TiB
> >> default.rgw.control         4   32      0 B        8      0 B      0     61 TiB
> >> default.rgw.meta            5    8  4.8 KiB       11  1.9 MiB      0     61 TiB
> >> default.rgw.buckets.index   6    8  1.6 GiB      211  4.7 GiB      0     61 TiB
> >> default.rgw.buckets.data   10  128  501 GiB    5.36M  3.5 TiB   1.90    110 TiB
> >>
> >> The *default.rgw.buckets.data* pool is using erasure coding:
> >>
> >> root@ceph-01:~# ceph osd erasure-code-profile get EC_RGW_HOST
> >> crush-device-class=hdd
> >> crush-failure-domain=host
> >> crush-root=default
> >> jerasure-per-chunk-alignment=false
> >> k=6
> >> m=4
> >> plugin=jerasure
> >> technique=reed_sol_van
> >> w=8
> >>
> >> If anyone could help explain why it's using up 7 times more space, it
> >> would help a lot. Versioning is disabled. ceph version 15.2.13
> >> (octopus stable).
> >>
> >> Sincerely,
> >> Ark.
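
[Editor's note: a minimal back-of-the-envelope sketch of the allocation math in Josh's reply, assuming the default bluestore_min_alloc_size_hdd of 64 KiB and the 6+4 jerasure profile shown above; the object count and stored size are taken from the quoted ceph df output, nothing here queries a live cluster.]

import math

# Numbers taken from the thread: 64 KiB bluestore_min_alloc_size_hdd,
# a 6+4 jerasure profile, and 5.36M objects storing 501 GiB.
MIN_ALLOC = 64 * 1024      # bytes per allocation unit on an HDD OSD
K, M = 6, 4                # erasure-code data and coding chunks
OBJECTS = 5.36e6
STORED = 501 * 1024**3     # logical bytes stored in the pool

avg_obj = STORED / OBJECTS                       # ~98 KiB per object

# Simplified model: each object is split into K data chunks, and every
# chunk (data and coding) is rounded up to MIN_ALLOC on its OSD.
alloc_per_obj = math.ceil(avg_obj / K / MIN_ALLOC) * MIN_ALLOC * (K + M)

print(f"average object size : {avg_obj / 1024:6.1f} KiB")          # ~98 KiB
print(f"allocated per object: {alloc_per_obj / 1024:6.0f} KiB")    # 640 KiB
print(f"allocated:stored    : {alloc_per_obj / avg_obj:.2f}:1")    # ~6.53:1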
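
[Editor's note: since the question above is whether the cluster shows a high allocated:stored ratio, here is one hedged way to eyeball it per pool. It shells out to ceph df --format json and assumes the "stored" and "bytes_used" field names as seen on Octopus; treat it as a sketch, not a supported tool.]

import json
import subprocess

# Sketch: approximate the allocated:stored ratio per pool from
# `ceph df --format json`. For an EC pool the (k+m)/k parity overhead
# is part of the ratio, so compare against that baseline
# (about 1.67:1 for 6+4) rather than against 1.0.
raw = subprocess.run(
    ["ceph", "df", "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

for pool in json.loads(raw)["pools"]:
    stats = pool["stats"]
    if stats.get("stored"):
        ratio = stats["bytes_used"] / stats["stored"]
        print(f"{pool['name']:<30} {ratio:6.2f}:1 allocated:stored")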