Hello All,
I'm hoping I can get some help with an issue in the dashboard after
a recent bare-metal Ceph upgrade from Octopus to Quincy.
** Please note: this message originally described the problem as
affecting only the Images tab. Shortly after writing it, I found the
same issue on another cluster that was also upgraded from Octopus to
Quincy 17.2.7 within the last few months; there it affects all tabs
in the Ceph dashboard, which slows to a crawl until I restart or fail
over the mgr. Both clusters run on top of Ubuntu 20.04.
Everything appears to be working fine besides the Block --> Images
tab. It doesn't matter which node I fail over to; I've tried reboots,
reinstalling ceph-mgr-dashboard, different browsers, different
clients, etc. The tab will not load the 4 RBD images I have, even
though they appear in rbd ls, I can query them, and the connection on
the end appliance is fine. The loading icons spin indefinitely
without any failure message.
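For context, this is roughly how I'm verifying the images from the CLI (pool and image names omitted here; substitute your own), and all of it responds normally:

```shell
# List the images in the pool shown on the dashboard's Block --> Images tab
rbd ls <pool>

# Query one of the images directly; these return immediately
rbd info <pool>/<image>
rbd status <pool>/<image>
```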
If I access the Images tab and then move to any other tab in the
dashboard, it lets me navigate but doesn't display anything until I
either restart the service on the active mgr or fail over to another
one, so everything works as expected until I access this one tab.
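Concretely, the two workarounds that bring the dashboard back each time look like this (osd31 is my current active mgr; substitute the active mgr on your cluster):

```shell
# Restart the active mgr's daemon in place...
systemctl restart ceph-mgr@osd31.service

# ...or tell the cluster to fail over to a standby mgr
ceph mgr fail osd31
```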
When I use any other section of the dashboard, CPU utilization for
ceph-mgr is normal, but when I access the Images tab it spikes as
high as 600% and stays like that until I restart the service or fail
over the active mgr.
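In case it helps with diagnosis, this is how I've been turning up mgr and dashboard debugging before reproducing the hang (standard Ceph commands; osd31 is my active mgr):

```shell
# Raise the mgr's log verbosity (revert afterwards with: ceph config rm mgr debug_mgr)
ceph config set mgr debug_mgr 20

# Put the dashboard module into debug mode
ceph dashboard debug enable

# Reproduce the issue, then pull the active mgr's recent log
journalctl -u ceph-mgr@osd31.service --since "10 minutes ago"
```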
-- Active MGR before clicking Block --> Images; the OSDs spike for a
second but revert to around 5%
top - 13:43:37 up 8 days, 23:09, 1 user, load average: 8.08, 5.02, 4.37
Tasks: 695 total, 1 running, 694 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.0 us, 1.6 sy, 0.0 ni, 89.7 id, 1.2 wa, 0.0 hi, 0.5 si, 0.0 st
-----
MiB Mem : 128474.1 total, 6705.6 free, 65684.0 used, 56084.5 buff/cache
MiB Swap: 40927.0 total, 35839.3 free, 5087.7 used. 49253.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14156 ceph 20 0 3420632 1.9g 13668 S 55.3 1.5 864:49.51 ceph-osd
13762 ceph 20 0 3421384 1.8g 13432 S 51.3 1.4 960:22.12 ceph-osd
14163 ceph 20 0 3422352 1.7g 13016 S 50.0 1.3 902:41.19 ceph-osd
13803 ceph 20 0 3469596 1.8g 13532 S 44.7 1.4 941:55.10 ceph-osd
13774 ceph 20 0 3427560 1.7g 13656 S 38.7 1.4 932:02.51 ceph-osd
13801 ceph 20 0 3439796 1.7g 13448 S 37.7 1.3 981:25.55 ceph-osd
14025 ceph 20 0 3426360 1.8g 13780 S 36.4 1.4 994:00.75 ceph-osd
9888 nobody 20 0 126100 8696 0 S 21.2 0.0 1106:19 node_exporter
126798 ceph 20 0 1787824 528000 39464 S 7.9 0.4 0:14.84 ceph-mgr
13795 ceph 20 0 3420252 1.7g 13264 S 7.6 1.4 990:00.61 ceph-osd
13781 ceph 20 0 3484476 1.9g 13248 S 6.3 1.5 1040:10 ceph-osd
13777 ceph 20 0 3408972 1.8g 13464 S 6.0 1.5 1026:21 ceph-osd
13797 ceph 20 0 3432068 1.6g 13932 S 6.0 1.3 950:39.35 ceph-osd
13779 ceph 20 0 3471668 1.7g 12728 S 5.6 1.3 984:53.80 ceph-osd
13768 ceph 20 0 3496064 1.9g 13504 S 5.3 1.5 918:37.48 ceph-osd
13786 ceph 20 0 3422044 1.6g 13456 S 5.3 1.3 974:29.08 ceph-osd
13788 ceph 20 0 3454184 1.9g 13048 S 5.3 1.5 980:35.78 ceph-osd
13776 ceph 20 0 3445680 1.7g 12880 S 5.0 1.3 998:30.58 ceph-osd
13785 ceph 20 0 3409548 1.7g 13704 S 5.0 1.3 939:37.08 ceph-osd
14152 ceph 20 0 3465284 1.7g 13840 S 5.0 1.4 959:39.42 ceph-osd
10339 nobody 20 0 6256048 531428 60188 S 4.6 0.4 239:37.56 prometheus
13802 ceph 20 0 3430696 1.8g 13872 S 4.6 1.4 924:15.74 ceph-osd
13791 ceph 20 0 3498876 1.5g 12648 S 4.3 1.2 962:58.37 ceph-osd
13800 ceph 20 0 3455268 1.7g 12404 S 4.3 1.3 1000:41 ceph-osd
13790 ceph 20 0 3434364 1.6g 13516 S 3.3 1.3 974:16.46 ceph-osd
14217 ceph 20 0 3443436 1.8g 13560 S 3.3 1.4 902:54.22 ceph-osd
13526 ceph 20 0 1012048 499628 11244 S 3.0 0.4 349:35.28 ceph-mon
13775 ceph 20 0 3367284 1.6g 13940 S 3.0 1.3 878:38.27 ceph-osd
13784 ceph 20 0 3380960 1.8g 12892 S 3.0 1.4 910:50.47 ceph-osd
13789 ceph 20 0 3432876 1.6g 12464 S 2.6 1.2 922:45.15 ceph-osd
13804 ceph 20 0 3428120 1.9g 13192 S 2.6 1.5 865:31.30 ceph-osd
14153 ceph 20 0 3432752 1.8g 12576 S 2.3 1.4 874:27.92 ceph-osd
14192 ceph 20 0 3412640 1.9g 13512 S 2.3 1.5 923:01.97 ceph-osd
13796 ceph 20 0 3433016 1.8g 13164 S 2.0 1.4 982:08.21 ceph-osd
13798 ceph 20 0 3405708 1.6g 13508 S 2.0 1.3 873:50.34 ceph-osd
13814 ceph 20 0 4243252 1.5g 13500 S 2.0 1.2 2020:41 ceph-osd
13985 ceph 20 0 3487848 1.6g 13100 S 2.0 1.3 942:21.96 ceph-osd
14001 ceph 20 0 4194336 1.9g 13460 S 2.0 1.5 2143:46 ceph-osd
14186 ceph 20 0 3441852 1.5g 13360 S 2.0 1.2 956:30.81 ceph-osd
7257 root 20 0 82332 3480 2984 S 0.3 0.0 9:22.50 irqbalance
7269 syslog 20 0 224344 3648 2392 S 0.3 0.0 0:11.79 rsyslogd
16621 472 20 0 898376 79800 11080 S 0.3 0.1 146:36.08 grafana
104366 root 20 0 0 0 0 I 0.3 0.0 1:11.30 kworker/0:2-events
125676 root 20 0 0 0 0 I 0.3 0.0 0:00.48 kworker/u104:7-public-bond
127115 root 20 0 10172 4636 3392 R 0.3 0.0 0:00.10 top
1 root 20 0 180712 16312 5652 S 0.0 0.0 1:31.85 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.15 kthreadd
---
-- Active MGR roughly a minute later, after clicking Block --> Images
top - 13:44:28 up 8 days, 23:09, 1 user, load average: 6.79, 5.11, 4.43
Tasks: 695 total, 1 running, 694 sleeping, 0 stopped, 0 zombie
%Cpu(s): 12.9 us, 2.1 sy, 0.0 ni, 83.5 id, 1.0 wa, 0.0 hi, 0.5 si, 0.0 st
MiB Mem : 128474.1 total, 6219.8 free, 66504.7 used, 55749.6 buff/cache
MiB Swap: 40927.0 total, 35837.0 free, 5090.0 used. 48435.7 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
126798 ceph 20 0 1854596 573432 41484 S 482.5 0.4 1:45.70 ceph-mgr
14156 ceph 20 0 3420632 1.9g 13668 S 54.6 1.5 865:16.58 ceph-osd
13762 ceph 20 0 3421384 1.9g 13432 S 51.7 1.5 960:47.47 ceph-osd
13803 ceph 20 0 3469596 1.9g 13532 S 49.3 1.5 942:18.23 ceph-osd
14163 ceph 20 0 3422352 1.9g 13016 S 49.3 1.5 903:05.84 ceph-osd
13795 ceph 20 0 3420252 1.8g 13264 S 7.0 1.4 990:04.41 ceph-osd
13777 ceph 20 0 3408972 1.9g 13464 S 5.6 1.5 1026:24 ceph-osd
13786 ceph 20 0 3422044 1.6g 13456 S 5.3 1.3 974:32.09 ceph-osd
13797 ceph 20 0 3432068 1.6g 13932 S 5.3 1.3 950:42.02 ceph-osd
16621 472 20 0 898376 78776 11044 S 5.3 0.1 146:36.76 grafana
13791 ceph 20 0 3498876 1.5g 12648 S 5.0 1.2 963:00.94 ceph-osd
14001 ceph 20 0 4194336 1.9g 13460 S 5.0 1.5 2143:47 ceph-osd
9888 nobody 20 0 126100 8696 0 S 4.6 0.0 1106:23 node_exporter
13768 ceph 20 0 3496064 1.9g 13504 S 4.6 1.5 918:39.85 ceph-osd
13776 ceph 20 0 3445680 1.7g 12880 S 4.6 1.3 998:32.98 ceph-osd
13781 ceph 20 0 3484476 1.6g 13248 S 4.6 1.3 1040:13 ceph-osd
13785 ceph 20 0 3409548 1.7g 13704 S 4.6 1.3 939:40.08 ceph-osd
13788 ceph 20 0 3454184 1.9g 13048 S 4.6 1.5 980:38.24 ceph-osd
13779 ceph 20 0 3471668 1.7g 12728 S 4.3 1.3 984:56.39 ceph-osd
13800 ceph 20 0 3455268 1.7g 12404 S 4.3 1.3 1000:44 ceph-osd
13802 ceph 20 0 3430696 1.8g 13872 S 4.0 1.4 924:18.14 ceph-osd
14152 ceph 20 0 3465284 1.7g 13840 S 4.0 1.4 959:41.83 ceph-osd
13796 ceph 20 0 3433016 1.8g 13164 S 3.0 1.4 982:09.66 ceph-osd
13784 ceph 20 0 3380960 1.8g 12892 S 2.6 1.4 910:52.06 ceph-osd
13790 ceph 20 0 3434364 1.6g 13516 S 2.6 1.3 974:17.99 ceph-osd
13801 ceph 20 0 3439796 1.9g 13448 S 2.6 1.5 981:42.61 ceph-osd
14153 ceph 20 0 3432752 1.8g 12576 S 2.6 1.4 874:29.30 ceph-osd
14186 ceph 20 0 3441852 1.6g 13360 S 2.6 1.2 956:32.32 ceph-osd
14192 ceph 20 0 3412640 1.9g 13512 S 2.6 1.5 923:03.59 ceph-osd
13526 ceph 20 0 1012048 496332 11208 S 2.3 0.4 349:36.89 ceph-mon
13789 ceph 20 0 3432876 1.6g 12464 S 2.3 1.2 922:46.59 ceph-osd
13798 ceph 20 0 3405708 1.6g 13508 S 2.3 1.3 873:51.73 ceph-osd
14217 ceph 20 0 3443436 1.8g 13560 S 2.3 1.4 902:55.74 ceph-osd
13774 ceph 20 0 3427560 1.9g 13656 S 2.0 1.5 932:19.26 ceph-osd
13775 ceph 20 0 3367284 1.6g 13940 S 2.0 1.3 878:39.76 ceph-osd
13814 ceph 20 0 4243252 1.5g 13500 S 2.0 1.2 2020:42 ceph-osd
13985 ceph 20 0 3487848 1.6g 13100 S 2.0 1.3 942:23.39 ceph-osd
13804 ceph 20 0 3428120 1.9g 13192 S 1.7 1.5 865:32.71 ceph-osd
14025 ceph 20 0 3426360 1.8g 13780 S 1.7 1.5 994:17.32 ceph-osd
10339 nobody 20 0 6256048 537184 60136 S 1.0 0.4 239:38.44 prometheus
17547 nobody 20 0 128448 8572 0 S 1.0 0.0 31:54.99 alertmanager
127115 root 20 0 10172 4636 3392 R 0.7 0.0 0:00.43 top
---
OS: Ubuntu 20.04
128GB of memory
Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz
Supermicro X11SPL-F
---
Ceph Versions
{
    "mon": {
        "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)": 4
    },
    "osd": {
        "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)": 140
    },
    "mds": {
        "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)": 4
    },
    "ctdb": {
        "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)": 1
    },
    "rgw": {
        "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)": 2
    },
    "overall": {
        "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)": 154
    }
}
---
  cluster:
    id:     388dda42-9dd0-4858-a978-b3dc4c3b9152
    health: HEALTH_OK

  services:
    mon:  3 daemons, quorum jarn29,jarn30,jarn31 (age 8d)
    mgr:  osd31(active, since 10m), standbys: osd30, osd29, osd32
    mds:  2/2 daemons up, 2 standby
    osd:  140 osds: 140 up (since 3d), 140 in (since 3d)
          flags noautoscale
    ctdb: 1 daemon active (1 hosts)
    rgw:  2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   19 pools, 5024 pgs
    objects: 792.78M objects, 865 TiB
    usage:   1.4 PiB used, 586 TiB / 2.0 PiB avail
    pgs:     5013 active+clean
             11 active+clean+scrubbing+deep

  io:
    client: 19 MiB/s rd, 102 MiB/s wr, 672 op/s rd, 351 op/s wr
---
I've run perf and captured the ceph-mgr log for the first system
that is displaying the issue:
Perf - https://imgur.com/a/VMh4tDf
mgr log while accessing the RBD tab - https://pastebin.com/t96WCWfc
mgr logs prior to clicking the RBD tab - https://pastebin.com/e4dtuD3i
---
Apologies for the formatting; this is my first time posting here.
If anything else is needed please let me know!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx