What was the largest cluster that you upgraded that didn't exhibit the new
issue in 16.2.8? Thanks.

Respectfully,

*Wes Dillingham*
wes@xxxxxxxxxxxxxxxxx
LinkedIn <http://www.linkedin.com/in/wesleydillingham>


On Tue, May 17, 2022 at 10:24 AM David Orman <ormandj@xxxxxxxxxxxx> wrote:

> We had an issue with our original fix in 45963, which was resolved in
> https://github.com/ceph/ceph/pull/46096. It includes the fix as well as
> handling for upgraded clusters. This is in the 16.2.8 release. I'm not
> sure if it will resolve your problem (or help mitigate it), but it would
> be worth trying.
>
> Heads-up on 16.2.8, though: see the release thread; we ran into an issue
> with it on our larger clusters: https://tracker.ceph.com/issues/55687
>
> On Tue, May 17, 2022 at 3:44 AM BEAUDICHON Hubert (Acoss) <
> hubert.beaudichon@xxxxxxxx> wrote:
>
> > Hi Josh,
> >
> > I'm working with Stéphane and I'm the "ceph admin" (big words ^^) in
> > our team.
> > So, yes, as part of the upgrade we did the offline repair to split the
> > omap by pool.
> > The quick fix is, as far as I know, still disabled in the default
> > properties.
> >
> > Regarding I/O and CPU load, between Nautilus and Pacific we haven't
> > seen a really big change, just an increase in disk latency; in the end,
> > the "ceph read operation" metric dropped from 20K to 5K or less.
> >
> > But yes, a lot of slow IOPS were emerging as time passed.
> >
> > At this time, we have taken one of our data nodes completely out and
> > recreated 5 of its 8 OSD daemons from scratch (DB on SSD, data on
> > spinning drive).
> > The results seem very good at the moment (we're seeing better metrics
> > than under Nautilus).
> >
> > Since the recreation, I have changed 3 parameters:
> > bdev_async_discard => osd : true
> > bdev_enable_discard => osd : true
> > bdev_aio_max_queue_depth => osd : 8192
> >
> > The first two have been extremely helpful for our SSD pool; even with
> > enterprise-grade SSDs, the "trim" seems to have rejuvenated our pool.
> > The last one was set in response to messages on the newly created OSDs:
> > "bdev(0x55588e220400 <path to block>) aio_submit retries XX"
> > After changing them and restarting the OSD processes, the messages were
> > gone, and it seems to have had a beneficial effect on our data node.
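> >
> > (For anyone who wants to try the same settings: one way to apply them
> > cluster-wide is through the monitors' config database, roughly as
> > sketched below. That's just one option — they can also go in ceph.conf —
> > and either way the OSDs need a restart before bluestore picks the
> > values up.)
> >
> >     # apply to all OSDs; takes effect at the next OSD restart
> >     ceph config set osd bdev_enable_discard true
> >     ceph config set osd bdev_async_discard true
> >     ceph config set osd bdev_aio_max_queue_depth 8192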
> >
> > I've seen that 16.2.8 came out yesterday, but I'm a little confused by:
> > [Revert] bluestore: set upper and lower bounds on rocksdb omap iterators
> > (pr#46092, Neha Ojha)
> > bluestore: set upper and lower bounds on rocksdb omap iterators
> > (pr#45963, Cory Snyder)
> >
> > (These two lines seem related to https://tracker.ceph.com/issues/55324.)
> >
> > One step forward, one step backward?
> >
> > Hubert Beaudichon
> >
> >
> > -----Original Message-----
> > From: Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx>
> > Sent: Monday, May 16, 2022 16:56
> > To: stéphane chalansonnet <schalans@xxxxxxxxx>
> > Cc: ceph-users@xxxxxxx
> > Subject: Re: Migration Nautilus to Pacific: Very high latencies
> > (EC profile)
> >
> > Hi Stéphane,
> >
> > On Sat, May 14, 2022 at 4:27 AM stéphane chalansonnet <
> > schalans@xxxxxxxxx> wrote:
> > > After a successful update from Nautilus to Pacific on CentOS 8.5, we
> > > observed some high latencies on our cluster.
> >
> > As a part of this upgrade, did you also migrate the OSDs to sharded
> > rocksdb column families? This would have been done by setting
> > bluestore's "quick fix on mount" setting to true or by issuing a
> > "ceph-bluestore-tool repair" offline, perhaps in response to a
> > BLUESTORE_NO_PER_POOL_OMAP warning post-upgrade.
> >
> > I ask because I'm wondering if you're hitting
> > https://tracker.ceph.com/issues/55324, for which there is a fix coming
> > in 16.2.8. If you inspect the nodes and disks involved in your EC pool,
> > are you seeing high read or write I/O? High CPU usage?
> >
> > Josh
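> >
> > P.S. In case it helps, that conversion is usually triggered in one of
> > the two ways sketched below. The systemd unit name and data path assume
> > a package-based (non-cephadm) deployment, and <id> stands for the OSD
> > id — adjust for your setup.
> >
> >     # online: convert each OSD automatically at its next restart
> >     ceph config set osd bluestore_fsck_quick_fix_on_mount true
> >
> >     # offline: stop the OSD, then repair it in place
> >     systemctl stop ceph-osd@<id>
> >     ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>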