Hi dear members of the list,

I'm discovering Ceph and doing some testing, and I came across some strange behaviour regarding the RAM used by the OSD processes.

Configuration:
  - Ceph version 12.0.2
  - 3 OSD nodes, 2 OSDs per node, 6 OSDs and 6 disks in total
  - 4 vCPUs and 6 GB of RAM per node
  - 64 PGs
  - Ubuntu 16.04

From the documentation, 500 MB to 1 GB per OSD should be enough, but in my case, and I don't really understand why, the OSDs consume a lot of RAM. Sometimes I can see the RAM being released, dropping from 4.5 GB to 1.5 GB, but most of the time it keeps growing until the node starts swapping and the OSD crashes :/ The more OSDs, the quicker it crashes, of course.

I'm using a 100 GB RBD image mapped on a client with "rbd map". With only 2 OSDs per node, I can crash it in fewer than a few iterations of:

_TESTPATH="/rbd/ceph-sda1"
_BS="300M"
_COUNT="1"
_OFLAG="direct"

echo "####------------ `hostname -f` ------------###"
echo "###CMD: dd if=/dev/zero of=${_TESTPATH}/testperf\${NUM} bs=${_BS} count=${_COUNT} oflag=${_OFLAG}"
echo "###TestPath: ${_TESTPATH}"
echo ""
for NUM in `seq 1 10`; do
    dd if=/dev/zero of=${_TESTPATH}/testperf${NUM} bs=${_BS} count=${_COUNT} oflag=${_OFLAG} \
        && rm ${_TESTPATH}/testperf${NUM}
done 2>&1 | grep copi | sort -n

which is essentially ten times in a row:

dd if=/dev/zero of=/rbd/ceph-sda1/testperf bs=300M count=1 oflag=direct

Here is my ceph.conf:

### begin
[global]
fsid = 2d892cb4-7992-485c-b4e0-2242fa508461
mon_initial_members = int-ceph-mon1a-fr
mon_host = 10.101.240.137
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 10.101.240.0/24
cluster network = 10.101.0.0/24
enable experimental unrecoverable data corrupting features = bluestore rocksdb
#bluestore_debug_omit_block_device_write = true

[client]
rbd cache = true
rbd cache size = 67108864                  # (64MB)
rbd cache max dirty = 50331648             # (48MB)
rbd cache target dirty = 33554432          # (32MB)
rbd cache max dirty age = 2
rbd cache writethrough until flush = true

[osd]
# Choose reasonable numbers for number of replicas and placement groups.
osd pool default size = 3        # Write an object 3 times
osd pool default min size = 2    # Allow writing with 2 copies in a degraded state
osd pool default pg num = 64
osd pool default pgp num = 64
debug osd = 0
debug bluestore = 0
debug bluefs = 0
debug rocksdb = 0
debug bdev = 0
bluestore = true
osd objectstore = bluestore
#bluestore fsck on mount = true
bluestore block create = true
bluestore block db size = 67108864
bluestore block db create = true
bluestore block wal size = 134217728
bluestore block wal create = true
#osd journal size = 10000        # default is to use the whole device if not set

[osd.0]
host = int-ceph-osd1a-fr
public addr = 10.101.240.140
cluster addr = 10.101.0.140
osd data = /var/lib/ceph/osd/ceph-0/

[osd.1]
host = int-ceph-osd1a-fr
public addr = 10.101.240.140
cluster addr = 10.101.0.140
osd data = /var/lib/ceph/osd/ceph-1/

[osd.2]
host = int-ceph-osd1b-fr
public addr = 10.101.240.141
cluster addr = 10.101.0.141
osd data = /var/lib/ceph/osd/ceph-2/

[osd.3]
host = int-ceph-osd1b-fr
public addr = 10.101.240.141
cluster addr = 10.101.0.141
osd data = /var/lib/ceph/osd/ceph-3/

[osd.4]
host = int-ceph-osd1c-fr
public addr = 10.101.240.142
cluster addr = 10.101.0.142
osd data = /var/lib/ceph/osd/ceph-4/

[osd.5]
host = int-ceph-osd1c-fr
public addr = 10.101.240.142
cluster addr = 10.101.0.142
osd data = /var/lib/ceph/osd/ceph-5/
#### END

Can you tell me if you see anything wrong here? Is Ceph supposed to free the RAM more quickly than it does on my systems? From what I see, most of the time it does not free the RAM and ends up crashing my OSDs.
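In case it helps, here is roughly how I plan to check where the memory actually goes while the dd loop runs. This is only a sketch: I'm assuming the default admin socket under /var/run/ceph/, and that the dump_mempools command and the bluestore_cache_size option are already available under those names in this 12.0.2 build.

# Per-OSD memory pool breakdown, via the OSD's admin socket
ceph daemon osd.0 dump_mempools

# tcmalloc heap statistics for all OSDs (only meaningful if Ceph is built with tcmalloc)
ceph tell osd.* heap stats

# Show the BlueStore cache size the OSD is currently running with
# (assuming the option is named bluestore_cache_size in this build)
ceph daemon osd.0 config get bluestore_cache_size

# Temporarily cap the BlueStore cache at 512 MB to see whether memory stays bounded
ceph tell osd.* injectargs '--bluestore_cache_size 536870912'

If capping the cache keeps the resident memory bounded during the dd loop, at least I would know whether it is the BlueStore cache or something else that keeps growing.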
Thanks a lot for your time and your help.

Regards,
Benoît G

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com