Slow ops incident with 2 BlueStore OSDs

Hi All,

Tonight we had an incident in our hyperconverged Proxmox cluster running Ceph v16.2.6.

At 04:56:08 OSD.12 in node1 started to report slow ops.
At 04:56:09 OSD.15 in node4 started to report slow ops.

ID   CLASS  WEIGHT    TYPE NAME             STATUS  REWEIGHT PRI-AFF
 -1         16.98720  root default
-13          3.56488      host node1
  2    hdd   0.90919          osd.2             up   1.00000 1.00000
  3    hdd   0.90919          osd.3             up   1.00000 1.00000
 12    ssd   1.74649          osd.12            up   1.00000 1.00000
-15          4.47406      host node2
  0    hdd   0.90919          osd.0             up   1.00000 1.00000
  4    hdd   0.90918          osd.4             up   1.00000 1.00000
  9    hdd   0.90919          osd.9             up   1.00000 1.00000
 13    ssd   1.74649          osd.13            up   1.00000 1.00000
 -6          4.47408      host node3
  8    hdd   0.90919          osd.8             up   1.00000 1.00000
 10    hdd   0.90919          osd.10            up   1.00000 1.00000
 11    hdd   0.90919          osd.11            up   1.00000 1.00000
 14    ssd   1.74649          osd.14            up   1.00000 1.00000
 -5          4.47418      host node4
  1    hdd   0.90919          osd.1             up   1.00000 1.00000
  6    hdd   0.90919          osd.6             up   1.00000 1.00000
  7    hdd   0.90919          osd.7             up   1.00000 1.00000
 15    ssd   1.74660          osd.15            up   1.00000 1.00000

The Ceph network is 10Gbps. We have one SSD replicated pool (size=3, min_size=2) and one HDD replicated pool (size=3, min_size=1). Autoscale is in warning mode. Both pools are about 25% full.
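
For reference, this is roughly how we check those pool settings from the CLI (the pool names below are placeholders, not our real ones):

ceph osd pool get <ssd-pool> size
ceph osd pool get <ssd-pool> min_size
ceph osd pool get <hdd-pool> min_size
ceph osd pool autoscale-status    # shows the per-pool autoscale mode ("warn" in our case)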

Around that time, node1's inbound network bandwidth peaked and held steady at about 83 Mb/s, and node4's outbound network bandwidth peaked and held steady at about 85 Mb/s.
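
Those figures come from the Proxmox graphs; something like the following, run on the nodes, would show the same per-interface rates (the Ceph interface name is a placeholder):

sar -n DEV 1 5 | grep <ceph-iface>    # rxkB/s / txkB/s on the Ceph network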

The issue was resolved by restarting OSD.15. Within about a minute, bandwidth went back to normal and the slow ops disappeared. Some VMs had to be rebooted, but so far so good :)
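
For the record, the restart was just the standard systemd unit restart on node4, roughly:

systemctl restart ceph-osd@15    # on node4
ceph -s                          # watch the slow ops warning clear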

It looks as if OSD.15 was sending data to OSD.12 but it was being refused? :-)
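
Next time we will try to dump the blocked ops from the admin socket before restarting; if I understand the tooling correctly, something like this (run on the node that hosts each OSD) should show what the primary is waiting on:

ceph daemon osd.15 dump_ops_in_flight    # on node4
ceph daemon osd.15 dump_blocked_ops      # on node4
ceph daemon osd.12 dump_ops_in_flight    # on node1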

Both OSDs have an SSD backing disk, with WAL/DB/data on the same disk. The format is BlueStore for all SSD OSDs, and for some of the HDD OSDs.
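
If it helps, this is how that layout can be confirmed (the grep pattern is just an example):

ceph osd metadata 15 | grep -E 'objectstore|devices'    # shows bluestore and the backing device
ceph-volume lvm list                                    # run on the OSD node; shows block/db/wal placement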

No disk issues are reported in syslog or dmesg for those two SSDs.
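
We checked roughly like this (the device name is a placeholder):

journalctl -k | grep -iE 'i/o error|ata|nvme'
smartctl -a /dev/sdX    # SMART health of the SSD behind the OSD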

Logs for OSDs:

--- OSD.15 node4
2022-01-12T04:46:40.737+0100 7fd28344a700  4 rocksdb: [db_impl/db_impl.cc:849] ------- DUMPING STATS -------
2022-01-12T04:46:40.737+0100 7fd28344a700  4 rocksdb: [db_impl/db_impl.cc:851]
[... stripped stats ...]

2022-01-12T04:56:09.491+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.232963+0100 currently waiting for sub ops
2022-01-12T04:56:09.491+0100 7fd28e460700 -1 osd.15 44032 get_health_metrics reporting 1 slow ops, oldest is osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032)
2022-01-12T04:56:10.499+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.232963+0100 currently waiting for sub ops
2022-01-12T04:56:10.499+0100 7fd28e460700 -1 osd.15 44032 get_health_metrics reporting 1 slow ops, oldest is osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032)
2022-01-12T04:56:11.519+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.232963+0100 currently waiting for sub ops
2022-01-12T04:56:11.519+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312662.0:8092600 7.4d 7:b36b6fe0:::rbd_data.9fe7186b8b4567.0000000000001c22:head [write 2367488~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:40.907230+0100 currently waiting for sub ops
2022-01-12T04:56:11.519+0100 7fd28e460700 -1 osd.15 44032 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032)
2022-01-12T04:56:12.515+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.232963+0100 currently waiting for sub ops
2022-01-12T04:56:12.515+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312662.0:8092600 7.4d 7:b36b6fe0:::rbd_data.9fe7186b8b4567.0000000000001c22:head [write 2367488~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:40.907230+0100 currently waiting for sub ops
2022-01-12T04:56:12.515+0100 7fd28e460700 -1 osd.15 44032 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032)
2022-01-12T04:56:13.491+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976492 7.3b 7:dcd73c34:::rbd_data.a001436b8b4567.0000000000000062:head [write 1855488~8192 in=8192b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.232963+0100 currently waiting for sub ops
2022-01-12T04:56:13.491+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312662.0:8092600 7.4d 7:b36b6fe0:::rbd_data.9fe7186b8b4567.0000000000001c22:head [write 2367488~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:40.907230+0100 currently waiting for sub ops
2022-01-12T04:56:13.491+0100 7fd28e460700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312785.0:96260005 7.45 7:a37c8881:::rbd_data.a002336b8b4567.0000000000000475:head [write 3395584~53248 in=53248b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:43.436038+0100 currently waiting for sub ops
[...]
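
To see which OSDs back the PGs named in these slow requests (7.3b, 7.4d, 7.45), something like this can be used; we assume osd.12 appears in the acting sets, but we did not capture that at the time:

ceph pg map 7.3b      # prints the up and acting OSD sets for that PG
ceph pg 7.3b query    # more detail, including peering state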

--- OSD.12 node1
2022-01-12T04:53:55.440+0100 7f1f79d13700  0 log_channel(cluster) log [DBG] : 7.71 scrub ok
2022-01-12T04:56:08.999+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612313472.0:156390670 7.56 7:6ad61629:::rbd_data.6cd97d6b8b4567.00000000000001a0:head [write 1015808~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:38.959449+0100 currently waiting for sub ops
2022-01-12T04:56:08.999+0100 7f1f8f53e700 -1 osd.12 44032 get_health_metrics reporting 1 slow ops, oldest is osd_op(client.612313472.0:156390670 7.56 7:6ad61629:::rbd_data.6cd97d6b8b4567.00000000000001a0:head [write 1015808~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032)
2022-01-12T04:56:09.999+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612313472.0:156390670 7.56 7:6ad61629:::rbd_data.6cd97d6b8b4567.00000000000001a0:head [write 1015808~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:38.959449+0100 currently waiting for sub ops
2022-01-12T04:56:09.999+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976478 7.73 7:ce752f82:::rbd_data.a001436b8b4567.0000000000000064:head [write 1855488~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.231786+0100 currently waiting for sub ops
2022-01-12T04:56:09.999+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976558 7.73 7:ce752f82:::rbd_data.a001436b8b4567.0000000000000064:head [write 425984~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.233680+0100 currently waiting for sub ops
2022-01-12T04:56:09.999+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976590 7.73 7:ce752f82:::rbd_data.a001436b8b4567.0000000000000064:head [write 3506176~4096 in=4096b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.234269+0100 currently waiting for sub ops
2022-01-12T04:56:09.999+0100 7f1f8f53e700 -1 osd.12 44032 get_health_metrics reporting 4 slow ops, oldest is osd_op(client.612313472.0:156390670 7.56 7:6ad61629:::rbd_data.6cd97d6b8b4567.00000000000001a0:head [write 1015808~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032)
2022-01-12T04:56:10.951+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612313472.0:156390670 7.56 7:6ad61629:::rbd_data.6cd97d6b8b4567.00000000000001a0:head [write 1015808~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:38.959449+0100 currently waiting for sub ops
2022-01-12T04:56:10.951+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976478 7.73 7:ce752f82:::rbd_data.a001436b8b4567.0000000000000064:head [write 1855488~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.231786+0100 currently waiting for sub ops
2022-01-12T04:56:10.951+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976558 7.73 7:ce752f82:::rbd_data.a001436b8b4567.0000000000000064:head [write 425984~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.233680+0100 currently waiting for sub ops
2022-01-12T04:56:10.951+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612312938.0:57976590 7.73 7:ce752f82:::rbd_data.a001436b8b4567.0000000000000064:head [write 3506176~4096 in=4096b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:39.234269+0100 currently waiting for sub ops
2022-01-12T04:56:10.951+0100 7f1f8f53e700  0 log_channel(cluster) log [WRN] : slow request osd_op(client.612313472.0:156390838 7.77 7:efe2ab8b:::rbd_data.6cd97d6b8b4567.0000000000000299:head [write 3948544~16384 in=16384b] snapc 0=[] ondisk+write+known_if_redirected e44032) initiated 2022-01-12T04:55:40.502724+0100 currently waiting for sub ops
[...]
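
If there is a next time, we will also grab the historic ops from osd.12 before restarting anything; as far as I know this is available on the admin socket (the second command may depend on the release):

ceph daemon osd.12 dump_historic_ops         # on node1
ceph daemon osd.12 dump_historic_slow_ops    # on node1, if available in v16.2.6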


Now everything is fine and stable, but this incident affected our service.

Does this look like a known issue, and/or can we provide additional info?

OSD.15 was recreated about 40 days ago because of another issue (DB corruption, see another thread by me and Igor).

Thanks

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



