We had an issue with our original fix in 45963, which was resolved in
https://github.com/ceph/ceph/pull/46096. That PR includes the fix as well
as handling for upgraded clusters, and it is in the 16.2.8 release. I'm
not sure whether it will resolve your problem (or help mitigate it), but
it would be worth trying. Heads up on 16.2.8, though: see the release
thread, we ran into an issue with it on our larger clusters:
https://tracker.ceph.com/issues/55687

On Tue, May 17, 2022 at 3:44 AM BEAUDICHON Hubert (Acoss) <
hubert.beaudichon@xxxxxxxx> wrote:

> Hi Josh,
>
> I'm working with Stéphane and I'm the "ceph admin" (big words ^^) on our
> team. So yes, as part of the upgrade we did the offline repair to split
> the omap by pool. The quick fix is, as far as I know, still disabled in
> the default properties.
>
> On the I/O and CPU load, between Nautilus and Pacific we haven't seen a
> really big change, just an increase in disk latency; in the end, the
> "ceph read operations" metric dropped from 20K to 5K or less.
>
> But yes, a lot of slow IOPS were emerging as time passed.
>
> At this point, we have taken one of our data nodes completely out and
> recreated 5 of its 8 OSD daemons from scratch (DB on SSD, data on
> spinning drive). The result looks very good so far (we're seeing better
> metrics than under Nautilus).
>
> Since the recreation, I have changed 3 parameters:
> bdev_async_discard => osd : true
> bdev_enable_discard => osd : true
> bdev_aio_max_queue_depth => osd : 8192
>
> The first two have been extremely helpful for our SSD pool; even with
> enterprise-grade SSDs, the "trim" seems to have rejuvenated it. The last
> one was set in response to messages on the newly created OSDs:
> "bdev(0x55588e220400 <path to block>) aio_submit retries XX"
> After changing it and restarting the OSD processes, the messages were
> gone, and it seems to have had a beneficial effect on our data node.
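For anyone who wants to try the same settings, a minimal sketch of how
they can be applied through the central config store (this assumes the
standard ceph config interface; the bdev_* options are generally only
read at OSD startup, so a restart is needed afterwards):

    # apply to all OSDs (or scope to a single OSD, e.g. "osd.12", to trial first)
    ceph config set osd bdev_enable_discard true
    ceph config set osd bdev_async_discard true
    ceph config set osd bdev_aio_max_queue_depth 8192

    # restart the OSDs so the bdev settings take effect
    # (package-based install shown; cephadm/rook deployments restart their
    # OSD containers instead)
    systemctl restart ceph-osd.target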
> I saw that 16.2.8 came out yesterday, but I'm a little confused by these
> two changelog entries:
> [Revert] bluestore: set upper and lower bounds on rocksdb omap iterators
> (pr#46092, Neha Ojha)
> bluestore: set upper and lower bounds on rocksdb omap iterators (pr#45963,
> Cory Snyder)
>
> (These two lines seem related to https://tracker.ceph.com/issues/55324.)
>
> One step forward, one step backward?
>
> Hubert Beaudichon
>
>
> -----Original Message-----
> From: Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx>
> Sent: Monday, May 16, 2022 16:56
> To: stéphane chalansonnet <schalans@xxxxxxxxx>
> Cc: ceph-users@xxxxxxx
> Subject: Re: Migration Nautilus to Pacifi : Very high latencies (EC profile)
>
> Hi Stéphane,
>
> On Sat, May 14, 2022 at 4:27 AM stéphane chalansonnet <schalans@xxxxxxxxx>
> wrote:
> > After a successful update from Nautilus to Pacific on CentOS 8.5, we
> > observed some high latencies on our cluster.
>
> As part of this upgrade, did you also migrate the OSDs to sharded rocksdb
> column families? This would have been done by setting bluestore's
> "quick fix on mount" setting to true or by issuing a "ceph-bluestore-tool
> repair" offline, perhaps in response to a BLUESTORE_NO_PER_POOL_OMAP
> warning post-upgrade.
>
> I ask because I'm wondering if you're hitting
> https://tracker.ceph.com/issues/55324, for which there is a fix coming in
> 16.2.8. If you inspect the nodes and disks involved in your EC pool, are
> you seeing high read or write I/O? High CPU usage?
>
> Josh
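For completeness, that conversion (quick fix on mount, or an offline
repair) is typically triggered in one of two ways; this is only a rough
sketch, with the OSD id and data path as placeholders:

    # option 1: convert automatically on the next OSD start
    ceph config set osd bluestore_fsck_quick_fix_on_mount true
    systemctl restart ceph-osd@<ID>

    # option 2: offline repair while the OSD is stopped
    systemctl stop ceph-osd@<ID>
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<ID>
    systemctl start ceph-osd@<ID>

Once every OSD has been converted, the BLUESTORE_NO_PER_POOL_OMAP warning
should clear from ceph health.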