Re: deep-scrub / backfilling: large amount of SLOW_OPS after upgrade to 13.2.8

Hi,

Quoting Stefan Kooman (stefan@xxxxxx):
> Hi,
> 
> After the upgrade to 13.2.8 deep-scrub has a big impact on client IO:
> loads of SLOW_OPS and high latency. We hardly ever had SLOW_OPS, but
> since the upgrade the impact is so big that we even have OSDs marking
> each other out (OSD op thread timeout) multiple times during the scrub
> window. Plenty of CPU / RAM / IOPS left, hardly any load on these OSD
> servers. Has anything changed in this release that can explain
> this behaviour?
> 
> Besides this the impact of rebalance is very severe as well. With only
> the balancer remapping a couple of PGs at a time there are loads of
> (MDS_)SLOW_OPS. This morning the cephfs metadata pool got rebalanced ...
> and that triggered a lot of SLOW_OPS. One particular OSD was pegged at
> 1000% CPU for more than half an hour (not doing that much IO): that's 10
> cores going full throttle! After a restart this issue was gone.

We can now also trigger SLOW_OPS on a bunch of OSDs by running
"rbd du -p $POOL", something that has never been an issue before. The
images in the rbd pools have the following features enabled: layering,
exclusive-lock, object-map, fast-diff, deep-flatten.

Has anything changed in 13.2.8 that affects these kinds of
operations?

Gr. Stefan



-- 
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx