Hello guys,
We have a fresh Luminous 12.2.0 rc cluster (32ce2a3ae5239ee33d6150705cdb24d43bab910c), installed using ceph-ansible.
The cluster consists of 6 nodes (Intel Server Board S2600WTTR), each with 64G of memory and an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (32 cores).
Each server has 16 * 1.6TB Dell SSD drives (SSDSC2BB016T7R), for a total of 96 OSDs and 3 mons.
The main usage is RBDs for our OpenStack (Ocata) environment.
We're at the beginning of our production tests, and the OSDs already look far too busy even though we generate almost no IOPS at this stage.
Every ceph-osd process is using around 30-50% of a CPU, and I can't figure out why they are so busy:
top - 07:41:55 up 49 days,  2:54,  2 users,  load average: 6.85, 6.40, 6.37
Tasks: 518 total,   1 running, 517 sleeping,   0 stopped,   0 zombie
%Cpu(s): 14.8 us,  4.3 sy,  0.0 ni, 80.3 id,  0.0 wa,  0.0 hi,  0.6 si,  0.0 st
KiB Mem : 65853584 total, 23953788 free, 40342680 used,  1557116 buff/cache
KiB Swap:  3997692 total,  3997692 free,        0 used. 18020584 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
36713 ceph      20   0 3869588 2.826g  28896 S  47.2  4.5   6079:20 ceph-osd
53981 ceph      20   0 3998732 2.666g  28628 S  45.8  4.2   5939:28 ceph-osd
55879 ceph      20   0 3707004 2.286g  28844 S  44.2  3.6   5854:29 ceph-osd
46026 ceph      20   0 3631136 1.930g  29100 S  43.2  3.1   6008:50 ceph-osd
39021 ceph      20   0 4091452 2.698g  28936 S  42.9  4.3   5687:39 ceph-osd
47210 ceph      20   0 3598572 1.871g  29092 S  42.9  3.0   5759:19 ceph-osd
52763 ceph      20   0 3843216 2.410g  28896 S  42.2  3.8   5540:11 ceph-osd
49317 ceph      20   0 3794760 2.142g  28932 S  41.5  3.4   5872:24 ceph-osd
42653 ceph      20   0 3915476 2.489g  28840 S  41.2  4.0   5605:13 ceph-osd
41560 ceph      20   0 3460900 1.801g  28660 S  38.5  2.9   5128:01 ceph-osd
50675 ceph      20   0 3590288 1.827g  28840 S  37.9  2.9   5196:58 ceph-osd
37897 ceph      20   0 4034180 2.814g  29000 S  34.9  4.5   4789:10 ceph-osd
50237 ceph      20   0 3379780 1.930g  28892 S  34.6  3.1   4846:36 ceph-osd
48608 ceph      20   0 3893684 2.721g  28880 S  33.9  4.3   4752:43 ceph-osd
40323 ceph      20   0 4227864 2.959g  28800 S  33.6  4.7   4712:36 ceph-osd
44638 ceph      20   0 3656780 2.437g  28896 S  33.2  3.9   4793:58 ceph-osd
61639 ceph      20   0  527512 114300  20988 S   2.7  0.2   2722:03 ceph-mgr
31586 ceph      20   0  765672 304140  21816 S   0.7  0.5 409:06.09 ceph-mon
   68 root      20   0       0      0      0 S   0.3  0.0   3:09.69 ksoftirqd/12
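My next step is to look at a per-thread view of one of the busy OSDs, e.g. (PID 36713 taken from the listing above, picked arbitrarily):

# show per-thread CPU usage for a single ceph-osd process
top -H -p 36713

which should at least tell me whether the cycles are going to the messenger threads, the tp_osd_tp worker pool, or the bluestore/rocksdb threads.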
strace doesn't show anything suspicious:

root@ecprdbcph10-opens:~# strace -p 36713
strace: Process 36713 attached
futex(0x563343c56764, FUTEX_WAIT_PRIVATE, 1, NULL
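(I realize plain strace -p only attaches to the main thread, which is just parked on a futex; to catch the worker threads I'd have to follow all of them, e.g.:

# attach to all threads of the OSD, not just the main one
strace -f -p 36713

or sample the process with perf top -p 36713, but I haven't gotten a useful picture out of that yet.)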
Ceph logs don't reveal anything either.
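If it would help, I can also dump the internal perf counters from the admin socket of one of the busy OSDs, e.g. (osd id picked arbitrarily, run on the node hosting it):

# dump internal performance counters of a single OSD
ceph daemon osd.0 perf dump

though I'm not sure which counters would point at CPU usage.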
Is this "normal" behavior in Luminous?
Looking through older threads, I could only find one about time gaps, which is not our case.
Thanks,
Alon