Hi dear members of the list,

I'm discovering Ceph and doing some testing, and I came across some strange behaviour regarding the RAM used by the OSD processes.

Configuration:
  - Ceph version 12.0.2
  - 3 OSD nodes, 2 OSDs per node, 6 OSDs and 6 disks in total
  - 4 vCPUs and 6 GB of RAM per node
  - 64 PGs
  - Ubuntu 16.04

From the documentation, 500 MB to 1 GB per OSD should be enough, but in my case, and I don't really understand why, the OSDs consume a lot of RAM. Sometimes I can see the RAM being released, dropping from 4.5 GB to 1.5 GB, but most of the time it keeps growing until the node starts swapping and the OSD crashes :/ The more OSDs, the quicker it crashes, of course.

I'm using a 100 GB RBD image mapped on a client with "rbd map". With only 2 OSDs per node, I can crash it in fewer than a few iterations of:

_TESTPATH="/rbd/ceph-sda1"
_BS="300M"
_COUNT="1"
_OFLAG="direct"

echo "####------------ `hostname -f` ------------###"
echo "###CMD: dd if=/dev/zero of=${_TESTPATH}/testperf\${NUM} bs=${_BS} count=${_COUNT} oflag=${_OFLAG}"
echo "###TestPath: ${_TESTPATH}"
echo ""
for NUM in `seq 1 10`; do
    dd if=/dev/zero of=${_TESTPATH}/testperf${NUM} bs=${_BS} count=${_COUNT} oflag=${_OFLAG} \
        && rm ${_TESTPATH}/testperf${NUM}
done 2>&1 | grep copi | sort -n

which is essentially ten times in a row:

dd if=/dev/zero of=/rbd/ceph-sda1/testperf bs=300M count=1 oflag=direct

Here is my ceph.conf:

### begin
[global]
fsid = 2d892cb4-7992-485c-b4e0-2242fa508461
mon_initial_members = int-ceph-mon1a-fr
mon_host = 10.101.240.137
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 10.101.240.0/24
cluster network = 10.101.0.0/24
enable experimental unrecoverable data corrupting features = bluestore rocksdb
#bluestore_debug_omit_block_device_write = true

[client]
rbd cache = true
rbd cache size = 67108864                  # (64MB)
rbd cache max dirty = 50331648             # (48MB)
rbd cache target dirty = 33554432          # (32MB)
rbd cache max dirty age = 2
rbd cache writethrough until flush = true

[osd]
# Choose reasonable numbers for number of replicas and placement groups.
osd pool default size = 3        # Write an object 3 times
osd pool default min size = 2    # Allow writing with 2 copies in a degraded state
osd pool default pg num = 64
osd pool default pgp num = 64
debug osd = 0
debug bluestore = 0
debug bluefs = 0
debug rocksdb = 0
debug bdev = 0
bluestore = true
osd objectstore = bluestore
#bluestore fsck on mount = true
bluestore block create = true
bluestore block db size = 67108864
bluestore block db create = true
bluestore block wal size = 134217728
bluestore block wal create = true
#osd journal size = 10000        # default is to use the whole device if not set

[osd.0]
host = int-ceph-osd1a-fr
public addr = 10.101.240.140
cluster addr = 10.101.0.140
osd data = /var/lib/ceph/osd/ceph-0/

[osd.1]
host = int-ceph-osd1a-fr
public addr = 10.101.240.140
cluster addr = 10.101.0.140
osd data = /var/lib/ceph/osd/ceph-1/

[osd.2]
host = int-ceph-osd1b-fr
public addr = 10.101.240.141
cluster addr = 10.101.0.141
osd data = /var/lib/ceph/osd/ceph-2/

[osd.3]
host = int-ceph-osd1b-fr
public addr = 10.101.240.141
cluster addr = 10.101.0.141
osd data = /var/lib/ceph/osd/ceph-3/

[osd.4]
host = int-ceph-osd1c-fr
public addr = 10.101.240.142
cluster addr = 10.101.0.142
osd data = /var/lib/ceph/osd/ceph-4/

[osd.5]
host = int-ceph-osd1c-fr
public addr = 10.101.240.142
cluster addr = 10.101.0.142
osd data = /var/lib/ceph/osd/ceph-5/
#### END

Can you tell me if you see anything wrong here? Is Ceph supposed to free the RAM more quickly than it does on my systems? From what I see, most of the time it does not free the RAM and ends up crashing my OSDs.
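In case it helps, here is roughly how I plan to check where the memory actually goes while the dd loop runs. This is only a sketch: I'm assuming the default admin socket under /var/run/ceph/, and that the dump_mempools command and the bluestore_cache_size option are already available under those names in this 12.0.2 build.

# Per-OSD memory pool breakdown, via the OSD's admin socket
ceph daemon osd.0 dump_mempools

# tcmalloc heap statistics for all OSDs (only meaningful if Ceph is built with tcmalloc)
ceph tell osd.* heap stats

# Show the BlueStore cache size the OSD is currently running with
# (assuming the option is named bluestore_cache_size in this build)
ceph daemon osd.0 config get bluestore_cache_size

# Temporarily cap the BlueStore cache at 512 MB to see whether memory stays bounded
ceph tell osd.* injectargs '--bluestore_cache_size 536870912'

If capping the cache keeps the resident memory bounded during the dd loop, at least I would know whether it is the BlueStore cache or something else that keeps growing.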
Thanks a lot for your time and your help.

Regards,
Benoît G

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com