Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

On Wed, Dec 07, 2022 at 11:13:49AM +0100, Boris Behrens wrote:
> Hi Sven,
> thanks for the input.
> 
> So I did some testing and maybe some optimization.
> The same disk type in two different hosts (one Ubuntu, one CentOS 7)
> shows VERY different iostat %util values:

I guess CentOS 7 has a rather old kernel. What are the kernel versions on
these hosts?
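
Something like this on each host would answer that:

  uname -r             # running kernel version
  cat /etc/os-release  # distribution and release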

I have seen a drastic increase in iostat %util numbers on a Ceph cluster
on Ubuntu hosts after an Ubuntu upgrade 18.04 => 20.04 => 22.04
(upgrading Ceph along with it). iostat %util has been high ever since,
but iostat latency values dropped considerably. As the cluster seemed
slightly faster overall after these upgrades, I did not worry much about
the increased %util numbers.

I can't tell whether this already showed up during the short time on
20.04. I did not investigate further, just shrugged it off as probably
some change in the kernel disk subsystem and/or CPU scheduling, or in
the way these measurements are taken. The usefulness of %util is limited
anyway.
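
If you want numbers that say more than %util, watching the await columns
over a few intervals is usually enough, e.g. (the field positions match
the newer-sysstat Ubuntu output quoted below; older versions differ):

  iostat -x 5 | awk '/^sd/ {print $1, $6, $12}'   # device, r_await, w_await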

Matthias


> Ubuntu:
> Device      r/s    rkB/s  rrqm/s  %rrqm  r_await  rareq-sz      w/s     wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz  aqu-sz  %util
> sdh       45.60  2419.20   20.40  30.91     1.03     53.05  1949.40  12928.80  516.00  20.93     0.10      6.63  0.00   0.00    0.00   0.00     0.00      0.00    0.02  66.00
> 
> CentOS:
> Device:  rrqm/s  wrqm/s    r/s      w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sdy        6.40  352.80  30.60  1264.40  1705.60  11061.60     19.72      0.24   0.18     2.39     0.13   0.08   9.82
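> 
> (Both outputs look like plain extended iostat, presumably something like
> 
>   iostat -x 5 2
> 
> on each host; the column sets differ because CentOS 7 ships a much older
> sysstat.)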
> 
> What I tried on the Ubuntu host (rough commands below):
> - disabled the write cache (no difference)
> - set the scheduler from mq-deadline to none (Ubuntu's multiqueue kernels
>   don't offer noop; none is its equivalent) (no difference)
> - set the tuned-adm profile to latency-performance (made it slightly better)
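> 
> For reference, roughly the commands involved (SATA assumed, sdh as the
> example device):
> 
>   hdparm -W 0 /dev/sdh                         # turn the drive write cache off
>   cat /sys/block/sdh/queue/scheduler           # list the available schedulers
>   echo none > /sys/block/sdh/queue/scheduler   # switch from mq-deadline to none
>   tuned-adm profile latency-performance        # apply the tuned profile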
> 
> It is really strange to see these high %util values only on the Ubuntu
> host.
> 
> 
> 
> 
> On Tue, Dec 6, 2022 at 19:22, Sven Kieske <S.Kieske@xxxxxxxxxxx> wrote:
> 
> > On Tue, 2022-12-06 at 15:38 +0100, Boris Behrens wrote:
> > > I've cross-checked the other 8TB disks in our cluster, which are around
> > > 30-50% with roughly the same IOPS.
> > > Maybe I am missing some optimization that is done on the CentOS 7 nodes
> > > but not on the Ubuntu 20.04 node. (If you know something off the top of
> > > your head, I am happy to hear it.)
> > > Maybe it is just measured differently on Ubuntu.
> >
> > The very first thing I would check is the drive read/write caches:
> >
> >
> > https://docs.ceph.com/en/quincy/start/hardware-recommendations/#write-caches
> >
> > (This part of the docs also applies to earlier Ceph releases; it just
> > wasn't present in the docs for older releases.)
> >
> > I'd recommend installing the udev rule which switches write caches off.
> >
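> > From memory, the rule in the linked docs boils down to one line placed
> > in a file such as /etc/udev/rules.d/99-ceph-write-through.rules (check
> > the docs for the exact wording):
> >
> >   ACTION=="add", SUBSYSTEM=="scsi_disk", ATTR{cache_type}:="write through"
> >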
> > You might want to evaluate first if your drives perform better or worse
> > without caches.
> >
> > IIRC there were some reports on this ML that, for certain workloads,
> > performance was even worse on some drives with the cache disabled.
> >
> > But I never experienced this myself.
> >
> > HTH
> >
> > --
> > Mit freundlichen Grüßen / Regards
> >
> > Sven Kieske
> > Systementwickler / systems engineer
> >
> >
> > Mittwald CM Service GmbH & Co. KG
> > Königsberger Straße 4-6
> > 32339 Espelkamp
> >
> > Tel.: 05772 / 293-900
> > Fax: 05772 / 293-333
> >
> > https://www.mittwald.de
> >
> > Managing directors: Robert Meyer, Florian Jürgens
> >
> > Tax no.: 331/5721/1033, VAT ID: DE814773217, HRA 6640, Bad Oeynhausen local court
> > General partner: Robert Meyer Verwaltungs GmbH, HRB 13260, Bad Oeynhausen local court
> >
> > Information on data processing in the course of our business activities
> > pursuant to Art. 13-14 GDPR is available at www.mittwald.de/ds.
> >
> 
> 
> -- 
> This time, as an exception, the "UTF-8 problems" self-help group meets
> in the large hall.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



