Re: Ceph 16.2.12, particular OSD shows higher latency than others

Those numbers look really high to me; more than 2 seconds for a write is awful. Is this an HDD-only cluster/pool? But even then it would be too high. I just compared with our HDD-backed cluster (although its RocksDB is SSD-backed), which also mainly serves RBD to OpenStack.

What is the general utilization of that host? Is it an upgraded cluster which could suffer from the performance degradation discussed in a recent thread? But I'd expect more OSDs to be affected by that. How many PGs and objects are on that OSD (ceph pg ls-by-osd <ID>)? Have you tried restarting and/or compacting the OSD to see if anything improves? You could set its primary-affinity to 0 or, in the worst case, rebuild that OSD. And are there no SMART errors or anything reported in dmesg about this disk?
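In case it's useful, a minimal sketch of those checks; the OSD ID (12) and device name (/dev/sdX) are placeholders, and the restart command assumes a cephadm-managed cluster:

# PGs and object counts on the suspect OSD
ceph pg ls-by-osd 12

# online compaction of the OSD's RocksDB, then a restart
ceph tell osd.12 compact
ceph orch daemon restart osd.12

# stop it from acting as primary without moving any data
ceph osd primary-affinity 12 0

# check the underlying drive for hardware issues
smartctl -a /dev/sdX
dmesg -T | grep -i sdX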

Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:

Thanks, Eugen!

It's a bunch of entries like this https://pastebin.com/TGPu6PAT - I'm not
really sure what to make of them. I checked adjacent OSDs and they have
similar ops, but aren't showing excessive latency.

/Z

On Thu, 27 Apr 2023 at 10:42, Eugen Block <eblock@xxxxxx> wrote:

Hi,

I would monitor the historic_ops_by_duration for a while and see if
any specific operation takes unusually long.

# this is within the container
[ceph: root@storage01 /]# ceph daemon osd.0 dump_historic_ops_by_duration | head
{
    "size": 20,
    "duration": 600,
    "ops": [
        {
            "description": "osd_repop(client.9384193.0:2056545 12.6 e2233/2221 12:6192870f:::obj_delete_at_hint.0000000053:head v 2233'696390, mlcod=2233'696388)",
            "initiated_at": "2023-04-27T07:37:35.046036+0000",
            "age": 54.805016199999997,
            "duration": 0.58198468699999995,
...

The output contains the PG (so you know which pool is involved) and
the duration of each operation; I'm not sure if that helps, though.
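For example, the PG in the sample description above is 12.6, so the pool ID is 12. A rough sketch of mapping that back to a pool name and to the OSDs serving the PG (the PG ID is just the one from the sample output):

# pool ID is the part before the dot in the PG ID (12.6 -> pool 12)
ceph osd pool ls detail | grep "^pool 12 "

# up/acting OSD sets for that PG
ceph pg map 12.6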

Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:

> As suggested by someone, I tried `dump_historic_slow_ops`. There aren't
> many, and they're somewhat difficult to interpret:
>
>             "description": "osd_op(client.250533532.0:56821 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3518464~8192] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299120+0000",
>             "description": "osd_op(client.250533532.0:56822 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3559424~4096] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299132+0000",
>             "description": "osd_op(client.250533532.0:56823 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3682304~4096] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299138+0000",
>             "description": "osd_op(client.250533532.0:56824 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3772416~4096] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299148+0000",
>             "description": "osd_op(client.250533532.0:56825 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3796992~8192] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299188+0000",
>             "description": "osd_op(client.250533532.0:56826 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3862528~8192] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299198+0000",
>             "description": "osd_op(client.250533532.0:56827 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3899392~12288] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299207+0000",
>             "description": "osd_op(client.250533532.0:56828 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 3944448~16384] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299250+0000",
>             "description": "osd_op(client.250533532.0:56829 13.16f
> 13:f6c9079e:::rbd_data.eed629ecc1f946.000000000000001c:head [stat,write
> 4018176~4096] snapc 0=[] ondisk+write+known_if_redirected e118835)",
>             "initiated_at": "2023-04-26T07:00:58.299270+0000",
>
> There's a lot more information there, of course. I also tried
> `dump_ops_in_flight`; there aren't many, usually 0-10 ops at a time, but
> the OSD latency remains high even when the ops count is low or zero. Any
> ideas?
>
> I would very much appreciate it if someone could point me to the
> documentation on interpreting the output of the ops dump.
>
> /Z
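Regarding mapping those ops to an image: the rbd_data.<id> part of the object name corresponds to an image's block_name_prefix, so something like this rough sketch can find the image (the pool name is a placeholder; it assumes a plain replicated RBD pool):

POOL=<pool-name>   # pool 13, per the PG ID 13.16f in the ops above
for img in $(rbd ls "$POOL"); do
    rbd info "$POOL/$img" | grep -q 'rbd_data.eed629ecc1f946' && echo "$img"
done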
>
>
> On Wed, 26 Apr 2023 at 20:22, Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
>
>> Hi,
>>
>> I have a Ceph 16.2.12 cluster with uniform hardware, same drive
>> make/model, etc. A particular OSD is showing higher latency than usual in
>> `ceph osd perf`, usually mid to high tens of milliseconds while other OSDs
>> show low single digits, although its drive's I/O stats don't look different
>> from those of other drives. The workload is mainly random 4K reads and
>> writes; the cluster is used as OpenStack VM storage.
>>
>> Is there a way to trace which particular PG, pool, and disk image or
>> object causes this OSD's excessive latency? Is there a way to tell Ceph to
>>
>> I would appreciate any advice or pointers.
>>
>> Best regards,
>> Zakhar
>>

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


