Hi Igor,

I think so because:

1) Space usage increases after each rebalance, even when the same PG is
moved a second time (!).

2) I have used a 4K min_alloc_size from the beginning.

One crazy hypothesis is that maybe Ceph allocates space for the
uncompressed objects first, then compresses them and leaks the
(uncompressed - compressed) difference. A really crazy idea, but who
knows o_O.

I already did a deep fsck and it didn't help. What else could I check?
(Sketches of the counter comparison you suggest, and of the parsing I
did, are in the P.S. below.)

On March 26, 2020 1:40:52 GMT+03:00, Igor Fedotov <ifedotov@xxxxxxx> wrote:
>Bluestore fsck/repair detect and fix leaks at the Bluestore level, but
>I doubt your issue is there.
>
>To be honest, I don't understand from the overview why you think there
>are any leaks at all...
>
>Not sure whether this is relevant, but in my experience space "leaks"
>are sometimes caused by a 64K allocation unit combined with tons of
>small files or massive small EC overwrites.
>
>To verify whether this applies, you might want to inspect the
>Bluestore performance counters (bluestore_stored vs.
>bluestore_allocated) to estimate your losses due to the high
>allocation unit.
>
>A significant difference on multiple OSDs would indicate that the
>overhead is caused by high allocation granularity. Compression might
>make this analysis less straightforward, though...
>
>
>Thanks,
>
>Igor
>
>
>On 3/26/2020 1:19 AM, vitalif@xxxxxxxxxx wrote:
>> I have a question regarding this problem - is it possible to rebuild
>> the Bluestore allocation metadata? I could try that to test whether
>> it's an allocator problem...
>>
>>> Hi.
>>>
>>> I'm experiencing some kind of space leak in Bluestore. I use EC,
>>> compression and snapshots. At first I thought the leak was caused
>>> by "virtual clones" (issue #38184), but then I got rid of most of
>>> the snapshots and continued to experience the problem.
>>>
>>> I suspected something when I added a new disk to the cluster and
>>> the free space in the cluster didn't increase (!).
>>>
>>> So to track down the issue I moved one PG (34.1a) using upmaps from
>>> osd11,6,0 to osd6,0,7 and then back to osd11,6,0.
>>>
>>> It ate +59 GB after the first move and +51 GB after the second. As
>>> I understand it, this proves that it's not #38184: devirtualization
>>> of virtual clones couldn't eat additional space after the SECOND
>>> rebalance of the same PG.
>>>
>>> The PG has ~39000 objects, it is EC 2+1 and compression is enabled.
>>> The compression ratio is about ~2.7 in my setup, so the PG should
>>> use ~90 GB of raw space.
>>>
>>> Before and after moving the PG I stopped osd0, mounted it with
>>> ceph-objectstore-tool with debug bluestore = 20/20 and opened the
>>> 34.1a***/all directory, which seems to dump all object extents into
>>> the log. So now I have two logs with all allocated extents for osd0
>>> (I hope all extents are there). I parsed both logs and added all
>>> compressed blob sizes together ("get_ref Blob ... 0x20000 -> 0x...
>>> compressed"). They add up to ~39 GB before the first rebalance
>>> (34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the
>>> second move (34.1as2), which doesn't indicate a leak.
>>>
>>> But the raw space usage still exceeds the initial value by a lot,
>>> so it's clear that there's a leak somewhere.
>>>
>>> What additional details can I provide for you to identify the bug?
>>>
>>> I posted the same message in the issue tracker:
>>> https://tracker.ceph.com/issues/44731

--
With best regards,
Vitaliy Filippov
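P.S. For the bluestore_stored vs. bluestore_allocated comparison you
suggest, I'd use something like this. A rough sketch only: it assumes
the OSD admin socket is reachable via "ceph daemon" on the local host
and that the counters carry these Nautilus-era names:

    #!/usr/bin/env python3
    # Compare BlueStore's stored vs. allocated bytes for one local OSD.
    import json
    import subprocess
    import sys

    osd_id = sys.argv[1]  # e.g. "0"
    out = subprocess.check_output(
        ["ceph", "daemon", "osd." + osd_id, "perf", "dump"])
    bs = json.loads(out)["bluestore"]

    stored = bs["bluestore_stored"]        # bytes of user data
    allocated = bs["bluestore_allocated"]  # bytes allocated on disk

    # With a large allocation unit, allocated greatly exceeds stored;
    # with compression the difference can even go negative.
    diff = allocated - stored
    print("osd.%s: stored=%d allocated=%d diff=%d (%.1f%%)"
          % (osd_id, stored, allocated, diff,
             100.0 * diff / max(stored, 1)))

Running it against every OSD and comparing the ratios should show
whether the losses correlate with allocation granularity.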
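P.P.S. In case anyone wants to reproduce my numbers: this is roughly
the parsing I did over the objectstore-tool logs. The regex is only my
guess at the shape of the "get_ref Blob ... 0x20000 -> 0x... compressed"
lines, so it may need adjusting for other Ceph versions:

    #!/usr/bin/env python3
    # Sum the compressed blob sizes dumped into an OSD log by
    # ceph-objectstore-tool run with debug bluestore = 20/20.
    import re
    import sys

    # e.g. "... get_ref Blob ... 0x20000 -> 0x7f00 compressed ..."
    pattern = re.compile(
        r"get_ref Blob.*0x[0-9a-f]+ -> 0x([0-9a-f]+).*compressed")

    total = 0
    with open(sys.argv[1]) as log:
        for line in log:
            m = pattern.search(line)
            if m:
                total += int(m.group(1), 16)  # compressed (on-disk) size

    print("compressed blobs: %d bytes (~%.1f GiB)"
          % (total, total / 2 ** 30))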