----- Original Message -----
> From: "Benoit GEORGELIN" <benoit.georgelin@xxxxxxxx>
> To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
> Sent: Saturday, May 13, 2017 19:57:41
> Subject: ceph bluestore RAM over used - luminous
>
> Hi dear members of the list,
>
> I'm discovering Ceph and doing some testing.
> I came across some strange behavior with the RAM used by the OSD processes.
>
> Configuration:
> ceph version 12.0.2
> 3 OSD nodes, 2 OSDs per node, 6 OSDs and 6 disks in total
> 4 vCPUs
> 6 GB of RAM
> 64 PGs
> Ubuntu 16.04
>
> From the documentation, 500 MB to 1 GB per OSD should be enough, but in my case, I don't really understand why, the OSDs consume a lot of RAM.
> Sometimes I can see the RAM being released, from 4.5 GB down to 1.5 GB, but most of the time it goes into swap and the OSD crashes :/
> The more OSDs, the quicker it crashes, of course. I'm using an RBD image (100 GB) mapped on a client with rbd map.
> With only 2 OSDs per node I can crash it within a few iterations of:
>
> _TESTPATH="/rbd/ceph-sda1"
> _BS="300M"
> _COUNT="1"
> _OFLAG="direct"
> echo "####------------ `hostname -f` ------------###"; echo "###CMD: dd if=/dev/zero of=${_TESTPATH}/testperf${NUM} bs=${_BS} count=${_COUNT} oflag=${_OFLAG}"; echo "###TestPath: ${_TESTPATH}"; echo ""; for NUM in `seq 1 10`; do dd if=/dev/zero of=${_TESTPATH}/testperf${NUM} bs=${_BS} count=${_COUNT} oflag=${_OFLAG} && rm ${_TESTPATH}/testperf${NUM}; done 2>&1 | grep copi|sort -n ;
>
> Which is 10 times in a row:
> dd if=/dev/zero of=/rbd/ceph-sda1/testperf bs=300M count=1 oflag=direct
>
> Here is my ceph.conf:
>
> ### begin
> [global]
> fsid = 2d892cb4-7992-485c-b4e0-2242fa508461
> mon_initial_members = int-ceph-mon1a-fr
> mon_host = 10.101.240.137
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
>
> public network = 10.101.240.0/24
> cluster network = 10.101.0.0/24
> enable experimental unrecoverable data corrupting features = bluestore rocksdb
>
> #bluestore_debug_omit_block_device_write = true
>
> [client]
> rbd cache = true
> rbd cache size = 67108864 # (64MB)
> rbd cache max dirty = 50331648 # (48MB)
> rbd cache target dirty = 33554432 # (32MB)
> rbd cache max dirty age = 2
> rbd cache writethrough until flush = true
>
> [osd]
> # Choose reasonable numbers for replicas and placement groups.
> osd pool default size = 3 # Write an object 3 times
> osd pool default min size = 2 # Allow writing with two copies in a degraded state
> osd pool default pg num = 64
> osd pool default pgp num = 64
>
> debug osd = 0
> debug bluestore = 0
> debug bluefs = 0
> debug rocksdb = 0
> debug bdev = 0
> bluestore = true
> osd objectstore = bluestore
> #bluestore fsck on mount = true
> bluestore block create = true
> bluestore block db size = 67108864
> bluestore block db create = true
> bluestore block wal size = 134217728
> bluestore block wal create = true
>
> #osd journal size = 10000 # default is to use the whole device if not set
>
> [osd.0]
> host = int-ceph-osd1a-fr
> public addr = 10.101.240.140
> cluster addr = 10.101.0.140
> osd data = /var/lib/ceph/osd/ceph-0/
>
> [osd.1]
> host = int-ceph-osd1a-fr
> public addr = 10.101.240.140
> cluster addr = 10.101.0.140
> osd data = /var/lib/ceph/osd/ceph-1/
>
> [osd.2]
> host = int-ceph-osd1b-fr
> public addr = 10.101.240.141
> cluster addr = 10.101.0.141
> osd data = /var/lib/ceph/osd/ceph-2/
>
> [osd.3]
> host = int-ceph-osd1b-fr
> public addr = 10.101.240.141
> cluster addr = 10.101.0.141
> osd data = /var/lib/ceph/osd/ceph-3/
>
> [osd.4]
> host = int-ceph-osd1c-fr
> public addr = 10.101.240.142
> cluster addr = 10.101.0.142
> osd data = /var/lib/ceph/osd/ceph-4/
>
> [osd.5]
> host = int-ceph-osd1c-fr
> public addr = 10.101.240.142
> cluster addr = 10.101.0.142
> osd data = /var/lib/ceph/osd/ceph-5/
>
> #### END
>
> Can you tell me if you see anything wrong here?
> Is Ceph supposed to "clear" the RAM more quickly than it does on my systems?
> From what I see, most of the time it does not free the RAM and crashes my OSDs.
>
> Thanks a lot for your time and your help.
>
> Regards,
>
> Benoît G,
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
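For reference, a quick way to see where the memory goes while the quoted dd loop runs is to watch the resident size of the ceph-osd processes and, if the admin socket is reachable, dump the BlueStore memory pools. This is only a sketch: the dump_mempools command is documented for Luminous, but whether it is available on this 12.0.2 dev build is an assumption.

# Watch resident memory (RSS, in KB) of every ceph-osd process on the node
watch -n 5 'ps -o pid,rss,vsz,cmd -C ceph-osd'

# Ask a running OSD to break down its internal memory pools
# (run on the OSD node; replace 0 with the OSD id)
ceph daemon osd.0 dump_mempools

If the mempool numbers stay small while the RSS keeps growing, the memory is being used outside the accounted caches, which is worth noting when reporting the issue.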
These are the errors I get from the OSD:

     0> 2017-05-14 23:55:01.910834 7f0338157700 -1 /build/ceph-12.0.2/src/os/bluestore/KernelDevice.cc: In function 'void KernelDevice::_aio_thread()' thread 7f0338157700 time 2017-05-14 23:55:01.907393
/build/ceph-12.0.2/src/os/bluestore/KernelDevice.cc: 364: FAILED assert(r >= 0)

 ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55b37a657072]
 2: (KernelDevice::_aio_thread()+0x1301) [0x55b37a5dcc61]
 3: (KernelDevice::AioCompletionThread::entry()+0xd) [0x55b37a5df45d]
 4: (()+0x76ba) [0x7f034262e6ba]
 5: (clone()+0x6d) [0x7f03416a582d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

[...]

2017-05-14 23:55:01.963223 7f0338157700 -1 *** Caught signal (Aborted) **
 in thread 7f0338157700 thread_name:bstore_aio

 ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
 1: (()+0xcab9b2) [0x55b37a5f39b2]
 2: (()+0x11390) [0x7f0342638390]
 3: (gsignal()+0x38) [0x7f03415d4428]
 4: (abort()+0x16a) [0x7f03415d602a]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x55b37a6571fe]
 6: (KernelDevice::_aio_thread()+0x1301) [0x55b37a5dcc61]
 7: (KernelDevice::AioCompletionThread::entry()+0xd) [0x55b37a5df45d]
 8: (()+0x76ba) [0x7f034262e6ba]
 9: (clone()+0x6d) [0x7f03416a582d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Any ideas about this RAM usage problem? I guess these errors happen because the RAM is completely consumed. After that error I'm not able to recover anything, so if you consider this a bug, I'll open an issue.
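For what it's worth, with 6 GB of RAM and two OSDs per node, the default BlueStore cache may simply be too large for these machines. Below is a minimal sketch of the kind of override one might try in the [osd] section, assuming the BlueStore cache options documented for Luminous; the option names and defaults may differ on this 12.0.2 dev build, and the values are only illustrative.

[osd]
# Cap BlueStore's in-memory cache per OSD (values in bytes).
# With 2 OSDs on a 6 GB node, staying well under 1 GB per OSD leaves
# headroom for the rest of the daemon and for the OS.
bluestore cache size = 536870912     # 512 MB total cache per OSD
bluestore cache kv max = 134217728   # at most 128 MB of that for the RocksDB block cache

The OSDs need a restart for the cache settings to take effect, and they only bound BlueStore's own caches rather than every allocation the daemon makes, so some extra headroom is still required.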
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com