Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

Hi Zakhar,

I can back up what Konstantin has reported -- we occasionally have
HDDs performing very slowly even though all SMART tests come back
clean. Besides ceph osd perf showing high latency, you would also
see a high %util for the device in iostat -x output.
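To spot such an OSD programmatically, something like this rough sketch can
help -- it parses the output of ceph osd perf -f json and flags OSDs whose
commit latency is far above the median. The JSON field names used below
(osdstats, osd_perf_infos, perf_stats, commit_latency_ms) are an assumption
based on recent releases; verify them against your version before relying
on this:

```python
import json
import statistics

# Sample data shaped like `ceph osd perf -f json` output (field names are
# an assumption; some releases nest the list under "osdstats").
sample = json.loads("""
{"osdstats": {"osd_perf_infos": [
  {"id": 0, "perf_stats": {"commit_latency_ms": 12,  "apply_latency_ms": 12}},
  {"id": 1, "perf_stats": {"commit_latency_ms": 9,   "apply_latency_ms": 9}},
  {"id": 2, "perf_stats": {"commit_latency_ms": 250, "apply_latency_ms": 250}},
  {"id": 3, "perf_stats": {"commit_latency_ms": 11,  "apply_latency_ms": 11}}
]}}
""")

def outlier_osds(perf_json, factor=5.0):
    """Return ids of OSDs whose commit latency exceeds factor * median."""
    infos = perf_json.get("osdstats", perf_json)["osd_perf_infos"]
    lats = {i["id"]: i["perf_stats"]["commit_latency_ms"] for i in infos}
    med = statistics.median(lats.values())
    return [osd for osd, lat in lats.items() if med > 0 and lat > factor * med]

print(outlier_osds(sample))  # osd.2 stands out against the median
```

Run periodically, a check like this catches the "sick but not dead" drives
before users notice.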

We normally replace those HDDs -- usually by draining and zeroing
them, then putting them back into production elsewhere (e.g. in a
different cluster or some other service). I don't have statistics on
how often those sick drives come back to full performance -- that
could indicate whether it was a poor physical connection, vibration,
etc. But I do recall some drives coming back repeatedly as "sick" but
not dead, with clean SMART tests.

If you have time, you can dig deeper with increased bluestore debug
levels. In our environment this happens often enough that we simply
drain, replace, and move on.
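For reference, the knobs involved look roughly like this (osd.12 is a
placeholder id; double-check the exact option names and default debug
levels for your release):

```shell
# Raise bluestore debug logging on the suspect OSD (placeholder id osd.12);
# the extra detail lands in that OSD's log file on its host.
ceph tell osd.12 config set debug_bluestore 20

# ...reproduce the slow I/O, then restore the default level.
ceph tell osd.12 config set debug_bluestore "1/5"

# To drain the OSD before pulling the drive, weight it out and let
# backfill move the PGs elsewhere.
ceph osd crush reweight osd.12 0
```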

Cheers, dan




On Fri, Oct 7, 2022 at 9:41 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
>
> Unfortunately, that isn't the case: the drive is perfectly healthy and,
> according to all measurements I did on the host itself, it isn't any
> different from any other drive on that host size-, health- or
> performance-wise.
>
> The only difference I noticed is that this drive sporadically does more I/O
> than other drives for a split second, probably due to specific PGs placed
> on its OSD, but the average I/O pattern is very similar to other drives and
> OSDs, so it's somewhat unclear why the specific OSD is consistently showing
> much higher latency. It would be good to figure out what exactly is causing
> these I/O spikes, but I'm not yet sure how to do that.
>
> /Z
>
> On Fri, 7 Oct 2022 at 09:24, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:
>
> > Hi,
> >
> > When one of 100 drives shows perf numbers that are unusually
> > different, it may mean 'this drive is not like the others' and the
> > drive should be replaced
> >
> >
> > k
> >
> > Sent from my iPhone
> >
> > > On 7 Oct 2022, at 07:33, Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
> > >
> > > Anyone, please?
> >
> >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


