Hello,

Yes, we got several slow ops stuck for many seconds. What we noted:
CPU/memory usage is lower than on Nautilus
( https://drive.google.com/file/d/1NGa5sA8dlQ65ld196Ku2hm_Y0xxvfvNs/view?usp=sharingt )
Same behaviour as you.

For the moment, the rebuild of one of our nodes seems to fix the latency
issue for it.

Example:
Disk write request avg waiting time (HDD)
  Nautilus: 8-11 ms
  Pacific before rebuild: 29-46 ms
  Pacific after rebuild: 4-5 ms
Disk average queue size
  Nautilus: 3-5
  Pacific before rebuild: 6-10
  Pacific after rebuild: 1-2

*As a part of this upgrade, did you also migrate the OSDs to sharded
rocksdb column families? This would have been done by setting bluestore's
"quick fix on mount" setting to true or by issuing a "ceph-bluestore-tool
repair" offline, perhaps in response to a BLUESTORE_NO_PER_POOL_OMAP
warning post-upgrade*
=> I'm going to let my colleague answer parts of that (he will probably
answer tomorrow).

Regards,

On Mon, 16 May 2022 at 17:20, Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx>
wrote:

> In our case it appears that file deletes have a very high impact on osd
> operations. Not a significant delete either: ~20T on a 1PB utilized
> filesystem (large files as well).
>
> We are trying to tune down cephfs delayed deletes via:
> "mds_max_purge_ops": "512",
> "mds_max_purge_ops_per_pg": "0.100000",
>
> with some success, but we are still experimenting with how we can reduce
> the throughput impact from osd slow ops.
>
> Respectfully,
>
> *Wes Dillingham*
> wes@xxxxxxxxxxxxxxxxx
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>
>
> On Mon, May 16, 2022 at 9:49 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx>
> wrote:
>
>> We have a newly built Pacific (16.2.7) cluster running 8+3 EC jerasure,
>> ~250 OSDs across 21 hosts, which has significantly lower than expected
>> IOPS. We are only doing about 30 IOPS per spinning disk (with
>> appropriately sized SSD bluestore db) at around ~100 PGs per OSD. We
>> have around 100 CephFS (ceph-fuse 16.2.7) clients using the cluster. The
>> cluster regularly reports slow ops from the OSDs, but the vast majority,
>> 90% plus of the OSDs, are only <50% IOPS utilized. Plenty of
>> cpu/ram/network is left on all cluster nodes. We have looked for
>> hardware (disk/bond/network/mce) issues across the cluster with no
>> findings, and checked send queues and receive queues across the cluster
>> to try to narrow in on an individual failing component, but nothing was
>> found there. Slow ops are also spread equally across the servers in the
>> cluster. Does your cluster report any health warnings (slow ops etc.)
>> alongside your reduced performance?
>>
>> Respectfully,
>>
>> *Wes Dillingham*
>> wes@xxxxxxxxxxxxxxxxx
>> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>>
>>
>> On Mon, May 16, 2022 at 2:00 AM Martin Verges <martin.verges@xxxxxxxx>
>> wrote:
>>
>>> Hello,
>>>
>>> Depending on your workload, drives and OSD allocation size, 3+2 can be
>>> way slower than 4+2. Maybe run a small benchmark and see if there is a
>>> huge difference; we ran some benchmarks like that and they showed quite
>>> ugly results in some tests. The best way to deploy EC, in our findings,
>>> is with a power of 2, like 2+x, 4+x, 8+x, 16+x. Especially if you
>>> deployed OSDs before the Ceph allocation-size change, you might end up
>>> consuming much more space if you don't use a power of 2. With the 4k
>>> allocation size, at least, this has been greatly improved for newly
>>> deployed OSDs.
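For reference, below is a minimal sketch of the kind of small benchmark
suggested above, assuming a throwaway test pool on a non-production
cluster; the profile and pool names (ec42profile, ec42test) and the PG
count are placeholders only:

    # create a 4+2 erasure-code profile and a test pool that uses it
    ceph osd erasure-code-profile set ec42profile k=4 m=2 crush-failure-domain=host
    ceph osd pool create ec42test 64 64 erasure ec42profile

    # 60-second write and random-read benchmarks, then clean up the objects
    rados bench -p ec42test 60 write --no-cleanup
    rados bench -p ec42test 60 rand
    rados -p ec42test cleanup

Repeating the same steps with a 3+2 profile on the same devices gives a
rough comparison between the two layouts; results will vary with object
size and allocation size.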
>>>
>>> --
>>> Martin Verges
>>> Managing director
>>>
>>> Mobile: +49 174 9335695 | Chat: https://t.me/MartinVerges
>>>
>>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>>> CEO: Martin Verges - VAT-ID: DE310638492
>>> Com. register: Amtsgericht Munich HRB 231263
>>> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>>>
>>>
>>> On Sun, 15 May 2022 at 20:30, stéphane chalansonnet <schalans@xxxxxxxxx>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > Thank you for your answer.
>>> > This is not good news if you also notice a performance decrease on
>>> > your side.
>>> > No, as far as we know, you cannot downgrade to Octopus.
>>> > Going forward seems to be the only way, so Quincy.
>>> > We have a qualification cluster, so we can try it there (but it is a
>>> > fully virtual configuration).
>>> >
>>> >
>>> > We are using 4+2 and 3+2 profiles.
>>> > Are you on the same profiles on your cluster?
>>> > Maybe replicated profiles are not impacted?
>>> >
>>> > Currently, we are recreating the OSDs one by one;
>>> > some parameters can only be set this way.
>>> > The first storage node is almost rebuilt; we will see if the latencies
>>> > on it are lower than on the others...
>>> >
>>> > Wait and see...
>>> >
>>> > On Sun, 15 May 2022 at 10:16, Martin Verges <martin.verges@xxxxxxxx>
>>> > wrote:
>>> >
>>> >> Hello,
>>> >>
>>> >> What exact EC level do you use?
>>> >>
>>> >> I can confirm that our internal data shows a performance drop when
>>> >> using Pacific. So far Octopus is faster and better than Pacific, but
>>> >> I doubt you can roll back to it. We haven't rerun our benchmarks on
>>> >> Quincy yet, but according to some presentations it should be faster
>>> >> than Pacific. Maybe try to jump away from the Pacific release into
>>> >> the unknown!
>>> >>
>>> >> --
>>> >> Martin Verges
>>> >> Managing director
>>> >>
>>> >> Mobile: +49 174 9335695 | Chat: https://t.me/MartinVerges
>>> >>
>>> >> croit GmbH, Freseniusstr. 31h, 81247 Munich
>>> >> CEO: Martin Verges - VAT-ID: DE310638492
>>> >> Com. register: Amtsgericht Munich HRB 231263
>>> >> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>>> >>
>>> >>
>>> >> On Sat, 14 May 2022 at 12:27, stéphane chalansonnet
>>> >> <schalans@xxxxxxxxx> wrote:
>>> >>
>>> >>> Hello,
>>> >>>
>>> >>> After a successful update from Nautilus to Pacific on CentOS 8.5,
>>> >>> we observed some high latencies on our cluster.
>>> >>>
>>> >>> We did not find much in the community related to latencies
>>> >>> post-migration.
>>> >>>
>>> >>> Our setup is:
>>> >>> 6x storage nodes (256 GB RAM, 2 SSD OSDs + 5x 6 TB SATA HDDs)
>>> >>> Erasure coding profile
>>> >>> We have two EC pools:
>>> >>> -> Pool1: full 6 TB SAS HDD drives
>>> >>> -> Pool2: full SSD drives
>>> >>>
>>> >>> S3 object and RBD block workloads
>>> >>>
>>> >>> Our performance on Nautilus, before the upgrade, was acceptable.
>>> >>> However, the next day, performance had dropped by a factor of 3 or 4.
>>> >>> Benchmarks showed 15K IOPS on the flash drives; before the upgrade
>>> >>> we had almost 80K IOPS.
>>> >>> Also, the HDD pool is almost unusable (latencies are far too high).
>>> >>>
>>> >>> We suspect, maybe, an impact of the erasure coding configuration on
>>> >>> Pacific.
>>> >>> Has anyone observed the same behaviour? Any tuning?
>>> >>>
>>> >>> Thank you for your help.
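As a side note to the per-pool omap question quoted at the top of this
thread, here is a minimal sketch of how the conversion state could be
checked, assuming a non-containerized deployment with the default data
path (osd.9 and its path are examples only); the offline repair must run
while the OSD is stopped:

    # OSDs that still need the conversion raise this health warning
    ceph health detail | grep -i BLUESTORE_NO_PER_POOL_OMAP

    # check whether the conversion would run automatically at OSD startup
    ceph config get osd bluestore_fsck_quick_fix_on_mount

    # or convert a single OSD offline
    systemctl stop ceph-osd@9
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-9
    systemctl start ceph-osd@9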
>>> >>>
>>> >>> ceph osd tree
>>> >>> ID   CLASS  WEIGHT     TYPE NAME                 STATUS  REWEIGHT  PRI-AFF
>>> >>>  -1         347.61304  root default
>>> >>>  -3          56.71570      host cnp31tcephosd01
>>> >>>   0    hdd    5.63399          osd.0                 up   1.00000  1.00000
>>> >>>   1    hdd    5.63399          osd.1                 up   1.00000  1.00000
>>> >>>   2    hdd    5.63399          osd.2                 up   1.00000  1.00000
>>> >>>   3    hdd    5.63399          osd.3                 up   1.00000  1.00000
>>> >>>   4    hdd    5.63399          osd.4                 up   1.00000  1.00000
>>> >>>   5    hdd    5.63399          osd.5                 up   1.00000  1.00000
>>> >>>   6    hdd    5.63399          osd.6                 up   1.00000  1.00000
>>> >>>   7    hdd    5.63399          osd.7                 up   1.00000  1.00000
>>> >>>  40    ssd    5.82190          osd.40                up   1.00000  1.00000
>>> >>>  48    ssd    5.82190          osd.48                up   1.00000  1.00000
>>> >>>  -5          56.71570      host cnp31tcephosd02
>>> >>>   8    hdd    5.63399          osd.8                 up   1.00000  1.00000
>>> >>>   9    hdd    5.63399          osd.9               down   1.00000  1.00000
>>> >>>  10    hdd    5.63399          osd.10                up   1.00000  1.00000
>>> >>>  11    hdd    5.63399          osd.11                up   1.00000  1.00000
>>> >>>  12    hdd    5.63399          osd.12                up   1.00000  1.00000
>>> >>>  13    hdd    5.63399          osd.13                up   1.00000  1.00000
>>> >>>  14    hdd    5.63399          osd.14                up   1.00000  1.00000
>>> >>>  15    hdd    5.63399          osd.15                up   1.00000  1.00000
>>> >>>  49    ssd    5.82190          osd.49                up   1.00000  1.00000
>>> >>>  50    ssd    5.82190          osd.50                up   1.00000  1.00000
>>> >>>  -7          56.71570      host cnp31tcephosd03
>>> >>>  16    hdd    5.63399          osd.16                up   1.00000  1.00000
>>> >>>  17    hdd    5.63399          osd.17                up   1.00000  1.00000
>>> >>>  18    hdd    5.63399          osd.18                up   1.00000  1.00000
>>> >>>  19    hdd    5.63399          osd.19                up   1.00000  1.00000
>>> >>>  20    hdd    5.63399          osd.20                up   1.00000  1.00000
>>> >>>  21    hdd    5.63399          osd.21                up   1.00000  1.00000
>>> >>>  22    hdd    5.63399          osd.22                up   1.00000  1.00000
>>> >>>  23    hdd    5.63399          osd.23                up   1.00000  1.00000
>>> >>>  51    ssd    5.82190          osd.51                up   1.00000  1.00000
>>> >>>  52    ssd    5.82190          osd.52                up   1.00000  1.00000
>>> >>>  -9          56.71570      host cnp31tcephosd04
>>> >>>  24    hdd    5.63399          osd.24                up   1.00000  1.00000
>>> >>>  25    hdd    5.63399          osd.25                up   1.00000  1.00000
>>> >>>  26    hdd    5.63399          osd.26                up   1.00000  1.00000
>>> >>>  27    hdd    5.63399          osd.27                up   1.00000  1.00000
>>> >>>  28    hdd    5.63399          osd.28                up   1.00000  1.00000
>>> >>>  29    hdd    5.63399          osd.29                up   1.00000  1.00000
>>> >>>  30    hdd    5.63399          osd.30                up   1.00000  1.00000
>>> >>>  31    hdd    5.63399          osd.31                up   1.00000  1.00000
>>> >>>  53    ssd    5.82190          osd.53                up   1.00000  1.00000
>>> >>>  54    ssd    5.82190          osd.54                up   1.00000  1.00000
>>> >>> -11          56.71570      host cnp31tcephosd05
>>> >>>  32    hdd    5.63399          osd.32                up   1.00000  1.00000
>>> >>>  33    hdd    5.63399          osd.33                up   1.00000  1.00000
>>> >>>  34    hdd    5.63399          osd.34                up   1.00000  1.00000
>>> >>>  35    hdd    5.63399          osd.35                up   1.00000  1.00000
>>> >>>  36    hdd    5.63399          osd.36                up   1.00000  1.00000
>>> >>>  37    hdd    5.63399          osd.37                up   1.00000  1.00000
>>> >>>  38    hdd    5.63399          osd.38                up   1.00000  1.00000
>>> >>>  39    hdd    5.63399          osd.39                up   1.00000  1.00000
>>> >>>  55    ssd    5.82190          osd.55                up   1.00000  1.00000
>>> >>>  56    ssd    5.82190          osd.56                up   1.00000  1.00000
>>> >>> -13          64.03453      host cnp31tcephosd06
>>> >>>  41    hdd    7.48439          osd.41                up   1.00000  1.00000
>>> >>>  42    hdd    7.48439          osd.42                up   1.00000  1.00000
>>> >>>  43    hdd    7.48439          osd.43                up   1.00000  1.00000
>>> >>>  44    hdd    7.48439          osd.44                up   1.00000  1.00000
>>> >>>  45    hdd    7.48439          osd.45                up   1.00000  1.00000
>>> >>>  46    hdd    7.48439          osd.46                up   1.00000  1.00000
>>> >>>  47    hdd    7.48439          osd.47                up   1.00000  1.00000
>>> >>>  57    ssd    5.82190          osd.57                up   1.00000  1.00000
>>> >>>  58    ssd    5.82190          osd.58                up   1.00000  1.00000
>>> >>> _______________________________________________
>>> >>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> >>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>> >>>
>>> >>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
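For completeness, a minimal sketch of how the per-OSD latency and disk
queue-size figures quoted at the top of this thread could be collected for
a before/after comparison; osd.9 is only an example id, and "ceph daemon"
has to be run on the host that carries that OSD:

    # commit/apply latency per OSD as seen by the cluster
    ceph osd perf

    # slowest recently completed operations on a suspect OSD
    ceph daemon osd.9 dump_historic_ops

    # device-level write wait time and average queue size, sampled every 5 seconds
    # (column names depend on the sysstat version: w_await/aqu-sz or await/avgqu-sz)
    iostat -x 5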