And I can see this in the error log:
Feb 2 16:41:28 ceph-las1-a4-osd kernel: bstore_kv_sync: page allocation stalls for 14188ms, order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null)
Karun Josy
On Sun, Feb 4, 2018 at 6:19 AM, Karun Josy <karunjosy1@xxxxxxxxx> wrote:
Hi,

We are using an EC profile in our cluster. We are seeing very high RAM usage on one OSD server. Sometimes available memory runs too low and the server hangs. We have to restart the daemons, which frees up the memory, but it gets used up again within a very short time.

Memory usage of daemons on the affected server:
-------------
  PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
16918 ceph 20  0 15.780g 0.013t  7928 S  28.2 21.9 67:29.09 ceph-osd
18568 ceph 20  0 25.833g 0.023t 26096 S  24.9 36.8  9:15.58 ceph-osd
22630 ceph 20  0 12.520g 0.011t 26660 S  22.3 18.3  5:49.03 ceph-osd
 2796 ceph 20  0 11.091g 9.851g  8900 S  13.6 15.7 25:17.68 ceph-osd

Memory usage on another server:
-------------------------------
  PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
11649 ceph 20  0 12.788g 9.563g 25068 S 107.6  7.6 12285:54 ceph-osd
18295 ceph 20  0 11.028g 6.069g 26212 S  54.0  4.8  2122:18 ceph-osd
30974 ceph 20  0 13.860g 0.010t 24956 S  46.4  8.1 10984:47 ceph-osd

We are using EC profile 5/3. There are also 2 failed disks in the cluster on other nodes (I have marked them down, but not out), so we cannot turn this node off, as that would force some PGs into the incomplete state.

Any help would be really appreciated.

Karun Josy
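To compare the two hosts, the RES and %MEM columns above can be turned into a rough per-host total. This is only a minimal sketch: the helper names (`res_to_gib`, `summarize`) are made up for illustration, the figures are copied from the top output above, and it assumes top's usual convention that a `g` or `t` suffix means GiB or TiB and a bare number means KiB. Since top rounds these columns, the estimated RAM size is approximate.

```python
# Scale factors from top's unit suffixes to GiB (assumed convention).
UNIT_GIB = {"k": 1 / 1024**2, "m": 1 / 1024, "g": 1.0, "t": 1024.0}


def res_to_gib(res: str) -> float:
    """Convert a top RES field such as '0.013t' or '9.851g' to GiB."""
    suffix = res[-1].lower()
    if suffix in UNIT_GIB:
        return float(res[:-1]) * UNIT_GIB[suffix]
    return float(res) / 1024**2  # bare number: KiB


def summarize(rows):
    """Return (total RES in GiB, total %MEM, estimated host RAM in GiB)."""
    total_res = sum(res_to_gib(res) for res, _ in rows)
    total_pct = sum(pct for _, pct in rows)
    est_ram = total_res / (total_pct / 100)
    return total_res, total_pct, est_ram


# (RES, %MEM) pairs for the ceph-osd processes listed above.
issue_server = [("0.013t", 21.9), ("0.023t", 36.8),
                ("0.011t", 18.3), ("9.851g", 15.7)]
other_server = [("9.563g", 7.6), ("6.069g", 4.8), ("0.010t", 8.1)]

for name, rows in (("issue server", issue_server),
                   ("other server", other_server)):
    total_res, total_pct, est_ram = summarize(rows)
    print(f"{name}: {total_res:.1f} GiB resident, "
          f"{total_pct:.1f}% of roughly {est_ram:.0f} GiB RAM")
```

On these numbers the four OSDs on the affected server hold roughly 58 GiB, a little over 90% of what appears to be about 62 GiB of RAM, while the other server seems to have about twice as much RAM and its listed OSDs use around a fifth of it.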
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com