Hi,
I’m trying to set up one-way rbd-mirroring for a Ceph cluster used by an OpenStack cloud, but the rbd-mirror daemon is unable to “catch up” with the changes. As far as I can tell the bottleneck is neither the Ceph clusters nor the network, but the server running the rbd-mirror process, which appears to be running out of CPU.
Is a high CPU load to be expected, or is it a symptom of something else? In other words, what can I check or do to get the mirroring working? 😊
Here is the current pool status:
# rbd mirror pool status nova
health: WARNING
images: 596 total
    572 starting_replay
    24 replaying
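To drill down I was also going to look at per-image state, roughly like this (IMAGE is just a placeholder for one of the nova images):
# rbd mirror pool status nova --verbose
# rbd mirror image status nova/IMAGE
And this is what top looks like on the rbd-mirror server while it is trying to replay: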
top - 13:31:36 up 79 days, 5:31, 1 user, load average: 32.27, 26.82, 25.33
Tasks: 360 total, 17 running, 182 sleeping, 0 stopped, 0 zombie
%Cpu(s): 8.9 us, 70.0 sy, 0.0 ni, 18.5 id, 0.0 wa, 0.0 hi, 2.7 si, 0.0 st
KiB Mem : 13205185+total, 12862490+free, 579508 used, 2847444 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 12948856+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2336553 ceph 20 0 17.1g 178160 20344 S 417.2 0.1 21:50.61 rbd-mirror
2312698 root 20 0 0 0 0 I 70.2 0.0 70:11.51 kworker/12:2
2312851 root 20 0 0 0 0 R 69.2 0.0 62:29.69 kworker/24:1
2324627 root 20 0 0 0 0 I 68.4 0.0 40:36.77 kworker/14:1
2235817 root 20 0 0 0 0 I 68.0 0.0 469:14.08 kworker/8:0
2241720 root 20 0 0 0 0 R 67.3 0.0 437:46.51 kworker/9:1
2306648 root 20 0 0 0 0 R 66.9 0.0 109:27.44 kworker/25:0
2324625 root 20 0 0 0 0 R 66.9 0.0 40:37.53 kworker/13:1
2336318 root 20 0 0 0 0 R 66.7 0.0 14:51.96 kworker/27:3
2324643 root 20 0 0 0 0 I 66.5 0.0 36:21.46 kworker/15:2
2294989 root 20 0 0 0 0 I 66.3 0.0 134:09.89 kworker/11:1
2324626 root 20 0 0 0 0 I 66.3 0.0 39:44.14 kworker/28:2
2324019 root 20 0 0 0 0 I 65.3 0.0 44:51.80 kworker/26:1
2235814 root 20 0 0 0 0 R 65.1 0.0 459:14.70 kworker/29:2
2294174 root 20 0 0 0 0 I 64.5 0.0 220:58.50 kworker/30:1
2324355 root 20 0 0 0 0 R 63.3 0.0 45:04.29 kworker/10:1
2263800 root 20 0 0 0 0 R 62.9 0.0 353:38.48 kworker/31:1
2270765 root 20 0 0 0 0 R 60.2 0.0 294:46.34 kworker/0:0
2294798 root 20 0 0 0 0 R 59.8 0.0 148:48.23 kworker/1:2
2307128 root 20 0 0 0 0 R 59.8 0.0 86:15.45 kworker/6:2
2307129 root 20 0 0 0 0 I 59.6 0.0 85:29.66 kworker/5:0
2294826 root 20 0 0 0 0 R 58.2 0.0 138:53.56 kworker/7:3
2294575 root 20 0 0 0 0 I 57.8 0.0 155:03.74 kworker/2:3
2294310 root 20 0 0 0 0 I 57.2 0.0 176:10.92 kworker/4:2
2295000 root 20 0 0 0 0 I 57.2 0.0 132:47.28 kworker/3:2
2307060 root 20 0 0 0 0 I 56.6 0.0 87:46.59 kworker/23:2
2294931 root 20 0 0 0 0 I 56.4 0.0 133:31.47 kworker/17:2
2318659 root 20 0 0 0 0 I 56.2 0.0 55:01.78 kworker/16:2
2336304 root 20 0 0 0 0 I 56.0 0.0 11:45.92 kworker/21:2
2306947 root 20 0 0 0 0 R 55.6 0.0 90:45.31 kworker/22:2
2270628 root 20 0 0 0 0 I 53.8 0.0 273:43.31 kworker/19:3
2294797 root 20 0 0 0 0 R 52.3 0.0 141:13.67 kworker/18:0
2330537 root 20 0 0 0 0 R 52.3 0.0 25:33.25 kworker/20:2
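Since roughly 70% of the CPU time is spent in the kernel (the sy column above) and the kworker threads dominate, my next step was going to be sampling the kernel with perf to see where that time actually goes, e.g.:
# perf top -g
or, for a profile I can save and share:
# perf record -a -g -- sleep 30
# perf report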
The main cluster has 12 nodes with 120 OSDs and the backup cluster has 6 nodes with 60 OSDs (but roughly the same amount of storage). The rbd-mirror daemon runs on a separate server with 2x E5-2650 v2 CPUs and 128 GB of memory.
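One thing I’m considering, assuming the option still goes by this name in my release, is throttling how many images the daemon syncs in parallel, since 572 images in starting_replay suggests it is trying to bootstrap everything at once. Something like this in ceph.conf on the rbd-mirror host:
[client]
    # limit parallel image syncs for the rbd-mirror daemon (default is 5)
    rbd mirror concurrent image syncs = 2
Does that sound like a reasonable knob to turn, or am I looking in the wrong place?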
Best regards
/Magnus