During conversion (or fsck), I stopped all other OSDs: I do not have
enough main memory to run that kind of process as well as the OSDs.

osd.0 is a 6TB rusty device; fsck eats 35GB of memory.
I have other rusty devices on that host: 3TB and 10TB.

Best regards,

On 4/3/20 12:50 PM, Igor Fedotov wrote:
> Thanks, Jack.
>
> One more question please - what's the actual maximum memory consumption
> for this specific OSD during fsck?
>
> And is it backed by a 3, 6 or 10 TB drive?
>
>
> Regards,
>
> Igor
>
> On 4/2/2020 7:15 PM, Jack wrote:
>> I do compress:
>> root@backup2:~# ceph daemon osd.0 config show | grep bluestore_compression
>>     "bluestore_compression_algorithm": "snappy",
>>     "bluestore_compression_max_blob_size": "0",
>>     "bluestore_compression_max_blob_size_hdd": "524288",
>>     "bluestore_compression_max_blob_size_ssd": "65536",
>>     "bluestore_compression_min_blob_size": "0",
>>     "bluestore_compression_min_blob_size_hdd": "8192",
>>     "bluestore_compression_min_blob_size_ssd": "8192",
>>     "bluestore_compression_mode": "force",
>>     "bluestore_compression_required_ratio": "0.955000",
>>
>> I will deal with the memory consumption.
>> After all, it just requires more time (starting the OSDs one by one),
>> and it still fits in my main memory.
>>
>> Thank you for checking out the issue.
>>
>>
>> On 4/2/20 5:28 PM, Igor Fedotov wrote:
>>> So this OSD has 32M of shared blobs and fsck loads them all into
>>> memory while processing. Hence the RAM consumption.
>>>
>>>
>>> I'm afraid there is no simple way to fix that, will create a ticket
>>> though.
>>>
>>>
>>> And a side question:
>>>
>>> 1) Do you use erasure coding and/or compression for the rbd pool?
>>>
>>> These stats look suspicious:
>>>
>>> POOL  ID  STORED   (DATA)   (OMAP)   OBJECTS  USED     (DATA)   (OMAP)   %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY   USED COMPR  UNDER COMPR
>>> rbd   1   245 TiB  245 TiB  9.0 MiB  50.26M   151 TiB  151 TiB  9.0 MiB  90.03  12 TiB     N/A            N/A          50.26M  35 TiB      144 TiB
>>>
>>> Stored - 245 TiB, Used - 151 TiB
>>>
>>> Can't imagine any explanation other than applied compression.
>>>
>>>
>>> Thanks,
>>>
>>> Igor
>>>
>>>
>>>
>>> On 4/2/2020 5:59 PM, Jack wrote:
>>>> Here it is
>>>>
>>>> On 4/2/20 3:48 PM, Igor Fedotov wrote:
>>>>> And may I have the output for:
>>>>>
>>>>> ceph daemon osd.N calc_objectstore_db_histogram
>>>>>
>>>>> This will collect some stats on record types in the OSD's DB.
>>>>>
>>>>>
>>>>> On 4/2/2020 4:13 PM, Jack wrote:
>>>>>> (fsck / quick-fix, same story)
>>>>>>
>>>>>> On 4/2/20 3:12 PM, Jack wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> A simple fsck eats the same amount of memory.
>>>>>>>
>>>>>>> Cluster usage: rbd with a bit of rgw.
>>>>>>>
>>>>>>> Here is the ceph df detail.
>>>>>>> All OSDs are single rusty devices.
>>>>>>>
>>>>>>> On 4/2/20 2:19 PM, Igor Fedotov wrote:
>>>>>>>> Hi Jack,
>>>>>>>>
>>>>>>>> could you please try the following - stop one of the already
>>>>>>>> converted OSDs and do a quick-fix/fsck/repair against it using
>>>>>>>> ceph-bluestore-tool:
>>>>>>>>
>>>>>>>> ceph-bluestore-tool --path <path to osd> --command quick-fix|fsck|repair
>>>>>>>>
>>>>>>>> Does it cause similar memory usage?
>>>>>>>>
>>>>>>>> You can stop experimenting if quick-fix reproduces the issue.
>>>>>>>>
>>>>>>>>
>>>>>>>> Also could you please describe your cluster and its usage a bit:
>>>>>>>> rgw/rbd/cephfs? If possible - please share 'ceph df detail'
>>>>>>>> output. Do you have a standalone DB volume on SSD/NVMe?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Igor
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/1/2020 6:28 PM, Jack wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> As the upgrade documentation says:
>>>>>>>>>> Note that the first time each OSD starts, it will do a format
>>>>>>>>>> conversion to improve the accounting for “omap” data. This may
>>>>>>>>>> take a few minutes to as much as a few hours (for an HDD with
>>>>>>>>>> lots of omap data). You can disable this automatic conversion
>>>>>>>>>> with:
>>>>>>>>> What the documentation does not say is that this process takes
>>>>>>>>> a lot of memory.
>>>>>>>>>
>>>>>>>>> I am upgrading a rusty cluster from Nautilus; you can check out
>>>>>>>>> the RAM consumption in the attachment.
>>>>>>>>>
>>>>>>>>> First, we have a 3TB OSD conversion: it took ~15min and 19GB of
>>>>>>>>> memory.
>>>>>>>>>
>>>>>>>>> Then, we have a larger 6TB OSD conversion: it took more than 2
>>>>>>>>> hours and 35GB of memory.
>>>>>>>>>
>>>>>>>>> Finally, we have the largest 10TB OSD: only 1h15, but 52GB of
>>>>>>>>> memory.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
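
For anyone hitting the same memory wall, here is a minimal sketch of the
one-OSD-at-a-time approach described in this thread, so that only a single
fsck/quick-fix holds its working set in RAM at any moment. It assumes a
non-containerized deployment with the default systemd unit names
(ceph-osd@<id>) and data paths (/var/lib/ceph/osd/ceph-<id>); the OSD id
list is a placeholder, and the bluestore_fsck_quick_fix_on_mount line is an
assumption about the option the truncated upgrade-note quote above refers
to - check the release notes for your version.

    # Assumption: this is the option the upgrade note refers to; it keeps
    # OSDs from running the omap conversion automatically at startup
    ceph config set osd bluestore_fsck_quick_fix_on_mount false

    # Avoid rebalancing while individual OSDs are briefly down
    ceph osd set noout

    for id in 0 1 2; do                        # placeholder OSD ids
        ceph osd ok-to-stop $id || break       # bail out if the cluster cannot spare this OSD
        systemctl stop ceph-osd@$id
        # Offline conversion: memory use is the same as at startup, but
        # only one fsck/quick-fix is resident at a time
        ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-$id --command quick-fix
        systemctl start ceph-osd@$id
    done

    ceph osd unset noout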