I do compress:

root@backup2:~# ceph daemon osd.0 config show | grep bluestore_compression
    "bluestore_compression_algorithm": "snappy",
    "bluestore_compression_max_blob_size": "0",
    "bluestore_compression_max_blob_size_hdd": "524288",
    "bluestore_compression_max_blob_size_ssd": "65536",
    "bluestore_compression_min_blob_size": "0",
    "bluestore_compression_min_blob_size_hdd": "8192",
    "bluestore_compression_min_blob_size_ssd": "8192",
    "bluestore_compression_mode": "force",
    "bluestore_compression_required_ratio": "0.955000",

I will deal with the memory consumption.
After all, it just requires more time (starting the OSDs one by one), and it
still fits in my main memory.

Thank you for checking out the issue.

On 4/2/20 5:28 PM, Igor Fedotov wrote:
> So this OSD has 32M of shared blobs and fsck loads them all into memory
> while processing. Hence the RAM consumption.
>
> I'm afraid there is no simple way to fix that, will create a ticket though.
>
> And a side question:
>
> 1) Do you use erasure coding and/or compression for rbd pool?
>
> These stats look suspicious
>
> POOL  ID  STORED   (DATA)   (OMAP)   OBJECTS  USED     (DATA)   (OMAP)   %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY   USED COMPR  UNDER COMPR
> rbd    1  245 TiB  245 TiB  9.0 MiB  50.26M   151 TiB  151 TiB  9.0 MiB  90.03  12 TiB     N/A            N/A          50.26M  35 TiB      144 TiB
>
> Stored - 245 TiB, Used - 151 TiB
>
> Can't imagine any explanation other than applied compression.
>
> Thanks,
>
> Igor
>
> On 4/2/2020 5:59 PM, Jack wrote:
>> Here it is
>>
>> On 4/2/20 3:48 PM, Igor Fedotov wrote:
>>> And may I have the output for:
>>>
>>> ceph daemon osd.N calc_objectstore_db_histogram
>>>
>>> This will collect some stats on record types in OSD's DB.
>>>
>>> On 4/2/2020 4:13 PM, Jack wrote:
>>>> (fsck / quick-fix, same story)
>>>>
>>>> On 4/2/20 3:12 PM, Jack wrote:
>>>>> Hi,
>>>>>
>>>>> A simple fsck eats the same amount of memory
>>>>>
>>>>> Cluster usage: rbd with a bit of rgw
>>>>>
>>>>> Here is the ceph df detail
>>>>> All OSDs are single rusty devices
>>>>>
>>>>> On 4/2/20 2:19 PM, Igor Fedotov wrote:
>>>>>> Hi Jack,
>>>>>>
>>>>>> could you please try the following - stop one of the already converted
>>>>>> OSDs and do a quick-fix/fsck/repair against it using ceph-bluestore-tool:
>>>>>>
>>>>>> ceph-bluestore-tool --path <path to osd> --command quick-fix|fsck|repair
>>>>>>
>>>>>> Does it cause similar memory usage?
>>>>>>
>>>>>> You can stop experimenting if quick-fix reproduces the issue.
>>>>>>
>>>>>> Also could you please describe your cluster and its usage a bit:
>>>>>> what's the usage: rgw/rbd/cephfs? If possible - please share
>>>>>> 'ceph df detail' output, do you have standalone DB volume at SSD/NVMe?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Igor
>>>>>>
>>>>>> On 4/1/2020 6:28 PM, Jack wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> As the upgrade documentation tells:
>>>>>>>> Note that the first time each OSD starts, it will do a format
>>>>>>>> conversion to improve the accounting for “omap” data. This may
>>>>>>>> take a few minutes to as much as a few hours (for an HDD with lots
>>>>>>>> of omap data).
>>>>>>>> You can disable this automatic conversion with:
>>>>>>>
>>>>>>> What the documentation does not say is that this process takes a
>>>>>>> lot of memory
>>>>>>>
>>>>>>> I am upgrading a rusty cluster from Nautilus, you can check out the
>>>>>>> RAM consumption in the attachment
>>>>>>>
>>>>>>> First, we have a 3TB OSD conversion: it took ~15min, and 19GB of
>>>>>>> memory
>>>>>>>
>>>>>>> Then, we have a larger 6TB OSD conversion: it took more than 2
>>>>>>> hours, and 35GB of memory
>>>>>>>
>>>>>>> Finally, you have the largest 10TB OSD: only 1H15, but 52GB of
>>>>>>> memory
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
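
A side note on the rbd pool figures Igor flagged ("Stored - 245 TiB, Used - 151 TiB"): the arithmetic behind that observation can be sketched in a few lines of Python. The four numbers are copied from the 'ceph df detail' line quoted in this thread. The function names are made up for illustration, and the reading of USED COMPR (space allocated for compressed data) and UNDER COMPR (amount of data that went through compression) is an assumption on my side, not something confirmed in the thread.

```python
# Figures for the rbd pool, copied from the 'ceph df detail' output
# quoted in this thread (all in TiB).
STORED_TIB = 245.0
USED_TIB = 151.0
USED_COMPR_TIB = 35.0
UNDER_COMPR_TIB = 144.0


def used_to_stored(stored: float, used: float) -> float:
    """Raw space consumed per unit of logical data stored.

    A value below 1.0 means the pool occupies less space than the data
    it holds, which is what made the stats look suspicious.
    """
    return used / stored


def compression_ratio(used_compr: float, under_compr: float) -> float:
    """Allocated size of compressed data relative to its original size.

    Assumption: USED COMPR is the space allocated for compressed data and
    UNDER COMPR is the amount of data that was passed through compression.
    """
    return used_compr / under_compr


def space_saved(used_compr: float, under_compr: float) -> float:
    """Space saved by compression, under the same assumption as above."""
    return under_compr - used_compr


if __name__ == "__main__":
    print(f"USED / STORED:            {used_to_stored(STORED_TIB, USED_TIB):.2f}")
    print(f"USED COMPR / UNDER COMPR: {compression_ratio(USED_COMPR_TIB, UNDER_COMPR_TIB):.2f}")
    print(f"Saved by compression:     {space_saved(USED_COMPR_TIB, UNDER_COMPR_TIB):.0f} TiB")
```

With these inputs USED / STORED comes out at roughly 0.62, which is consistent with Igor's guess that compression is in play and with the bluestore_compression_mode "force" setting shown at the top of the thread. It says nothing about the fsck memory usage itself, which Igor attributes to the 32M shared blobs.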