Hello Guys
We have a fresh Luminous 12.2.0 (rc) cluster (32ce2a3ae5239ee33d6150705cdb24d43bab910c), installed using ceph-ansible.
The cluster has 6 nodes (Intel server board S2600WTTR), each with 64G of memory and an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (32 cores).
Each server has 16 x 1.6TB Dell SSD drives (SSDSC2BB016T7R), for a total of 96 OSDs and 3 mons.
The main usage is RBD volumes for our OpenStack environment (Ocata).
We're at the beginning of our production tests, and the OSDs already look too busy even though we generate very few IOPS at this stage (almost nothing).
All ceph-osd processes are using around 50% CPU and I can't figure out why they are so busy:
top - 07:41:55 up 49 days, 2:54, 2 users, load average: 6.85, 6.40, 6.37
Tasks: 518 total, 1 running, 517 sleeping, 0 stopped, 0 zombie
%Cpu(s): 14.8 us, 4.3 sy, 0.0 ni, 80.3 id, 0.0 wa, 0.0 hi, 0.6 si, 0.0 st
KiB Mem : 65853584 total, 23953788 free, 40342680 used, 1557116 buff/cache
KiB Swap: 3997692 total, 3997692 free, 0 used. 18020584 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
36713 ceph 20 0 3869588 2.826g 28896 S 47.2 4.5 6079:20 ceph-osd
53981 ceph 20 0 3998732 2.666g 28628 S 45.8 4.2 5939:28 ceph-osd
55879 ceph 20 0 3707004 2.286g 28844 S 44.2 3.6 5854:29 ceph-osd
46026 ceph 20 0 3631136 1.930g 29100 S 43.2 3.1 6008:50 ceph-osd
39021 ceph 20 0 4091452 2.698g 28936 S 42.9 4.3 5687:39 ceph-osd
47210 ceph 20 0 3598572 1.871g 29092 S 42.9 3.0 5759:19 ceph-osd
52763 ceph 20 0 3843216 2.410g 28896 S 42.2 3.8 5540:11 ceph-osd
49317 ceph 20 0 3794760 2.142g 28932 S 41.5 3.4 5872:24 ceph-osd
42653 ceph 20 0 3915476 2.489g 28840 S 41.2 4.0 5605:13 ceph-osd
41560 ceph 20 0 3460900 1.801g 28660 S 38.5 2.9 5128:01 ceph-osd
50675 ceph 20 0 3590288 1.827g 28840 S 37.9 2.9 5196:58 ceph-osd
37897 ceph 20 0 4034180 2.814g 29000 S 34.9 4.5 4789:10 ceph-osd
50237 ceph 20 0 3379780 1.930g 28892 S 34.6 3.1 4846:36 ceph-osd
48608 ceph 20 0 3893684 2.721g 28880 S 33.9 4.3 4752:43 ceph-osd
40323 ceph 20 0 4227864 2.959g 28800 S 33.6 4.7 4712:36 ceph-osd
44638 ceph 20 0 3656780 2.437g 28896 S 33.2 3.9 4793:58 ceph-osd
61639 ceph 20 0 527512 114300 20988 S 2.7 0.2 2722:03 ceph-mgr
31586 ceph 20 0 765672 304140 21816 S 0.7 0.5 409:06.09 ceph-mon
68 root 20 0 0 0 0 S 0.3 0.0 3:09.69 ksoftirqd/12
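Should I be looking at per-thread CPU inside a single OSD? A minimal sketch of what I have in mind (the PID is just the first ceph-osd from the top output above; pidstat needs sysstat installed):

# break one OSD's CPU usage down by thread
top -H -p 36713
# or sample per-thread CPU once a second
pidstat -t -p 36713 1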
strace doesn't show anything suspicious:
root@ecprdbcph10-opens:~# strace -p 36713
strace: Process 36713 attached
futex(0x563343c56764, FUTEX_WAIT_PRIVATE, 1, NUL
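As far as I understand, strace -p attaches only to the main thread, which just sits in a futex wait, so it may not tell the whole story. A rough sketch of what might be worth trying next (osd.0 is just a placeholder id, and this assumes perf is installed on the node):

# sample all threads of the process to see where the CPU time goes
perf top -p 36713
# attach to all threads and summarize syscall counts
strace -f -c -p 36713
# dump the OSD's internal perf counters through the admin socket
ceph daemon osd.0 perf dump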
The Ceph logs don't reveal anything either.
Is this "normal" behavior in Luminous?
Looking through older threads, I could only find one about time gaps, which is not our case.
Thanks in advance,
Yair