Thanks for your observation. Indeed, I do not get a performance drop when upgrading from Nautilus to Octopus. But even using Pacific 16.1.0, the performance just goes down, so I guess we are running into the same issue somehow. I do not think just staying on Octopus is a solution, as it will reach EOL eventually. The source of this performance drop is still a mystery to me.

Luis Domingues

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, September 7th, 2021 at 10:51 AM, Martin Mlynář <nextsux@xxxxxxxxx> wrote:

> Hello,
>
> we've noticed a similar issue after upgrading our 3-node test cluster from
> 15.2.14-1~bpo10+1 to 16.1.0-1~bpo10+1.
>
> Quick tests using rados bench:
>
> 16.2.5-1~bpo10+1:
>
> Total time run:         133.28
> Total writes made:      576
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     17.2869
> Stddev Bandwidth:       34.1485
> Max bandwidth (MB/sec): 204
> Min bandwidth (MB/sec): 0
> Average IOPS:           4
> Stddev IOPS:            8.55426
> Max IOPS:               51
> Min IOPS:               0
> Average Latency(s):     3.59873
> Stddev Latency(s):      5.99964
> Max latency(s):         30.6307
> Min latency(s):         0.0865062
>
> After downgrading the OSDs:
>
> 15.2.14-1~bpo10+1:
>
> Total time run:         120.135
> Total writes made:      16324
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     543.524
> Stddev Bandwidth:       21.7548
> Max bandwidth (MB/sec): 580
> Min bandwidth (MB/sec): 436
> Average IOPS:           135
> Stddev IOPS:            5.43871
> Max IOPS:               145
> Min IOPS:               109
> Average Latency(s):     0.117646
> Stddev Latency(s):      0.0391269
> Max latency(s):         0.544229
> Min latency(s):         0.0602735
>
> We currently run on this setup:
>
> {
>     "mon": {
>         "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2
>     },
>     "mgr": {
>         "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 3
>     },
>     "osd": {
>         "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 35
>     },
>     "mds": {},
>     "overall": {
>         "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 38,
>         "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2
>     }
> }
>
> which solved the performance issues. All OSDs were newly created and fully
> synced from the other nodes both when upgrading and when downgrading back to 15.2.
>
> Best Regards,
> Martin
>
> On 05. 09. 21 at 19:45, Luis Domingues wrote:
>
> > Hello,
> >
> > I run a test cluster of 3 machines with 24 HDDs each, running bare-metal on CentOS 8. Long story short, I get a bandwidth of ~1,200 MB/s when I run a rados bench writing 128k objects with the cluster installed with Nautilus.
> >
> > When I upgrade the cluster to Pacific (using ceph-ansible to deploy and/or upgrade), my performance drops to ~400 MB/s of bandwidth on the same rados bench.
> >
> > I am kind of clueless as to what makes the performance drop so much. Does anyone have ideas on where I can dig to find the root of this difference?
> >
> > Thanks,
> >
> > Luis Domingues
>
> Martin Mlynář

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
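
For anyone who wants to reproduce these numbers: the output above is what rados bench prints, and an invocation along the following lines produces it. The pool name, runtime and thread count here are placeholders, not necessarily the exact values used in the tests above.

    # 4 MiB writes for 120 s, matching the write/object size in the output above
    rados bench -p testbench 120 write -b 4194304 -t 16 --no-cleanup

    # 128k writes, as in the original Nautilus vs. Pacific comparison
    rados bench -p testbench 120 write -b 131072 -t 16 --no-cleanup

    # remove the benchmark objects afterwards
    rados -p testbench cleanup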
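
The version breakdown pasted above is the output of ceph versions; running it before and after an upgrade is a quick way to confirm which daemons are actually on Pacific. The tell form below is just one optional way to get per-daemon detail.

    # JSON breakdown of running daemon versions, as pasted above
    ceph versions

    # ask each OSD individually which version it is running
    ceph tell osd.* version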