Hi All,

I am trying to configure rocksdb as the objectstore backend on a cluster running ceph version 0.91-375-g2a4cbfc. I built ceph using 'make-debs.sh', which builds the source with the --with-rocksdb option. I was able to get the cluster up and running with rocksdb as the backend, but as soon as I started writing data to the cluster with rados bench (a representative invocation is included after the details below), the cluster fell apart after only about 10 seconds of write I/O. Some OSD daemons get marked down at random for no apparent reason. Even if I bring all the daemons back up, after some time some of them are marked down again at random; this time the recovery I/O does the job that the external I/O did before. What could be causing this behaviour, and how can I fix it?

Some more details:

1. The setup is 3 OSD nodes with 10 SanDisk Optimus Eco (400GB) drives each. The drives were working fine with the filestore backend.

2. 3 monitors and 1 client, from which I am running rados bench.

3. Ubuntu 14.04 on each node (kernel 3.13.0-24-generic).

4. I create the OSDs on each node using the script below (with different OSD numbers on each node, of course):

##################################
#!/bin/bash

# Stop any OSD daemons still running on this node.
sudo stop ceph-osd-all
ps -eaf | grep osd | awk '{print $2}' | xargs sudo kill -9

osd_num=(0 1 2 3 4 5 6 7 8 9)
drives=(sdb1 sdc1 sdd1 sde1 sdf1 sdg1 sdh1 sdi1 sdj1 sdk1)
node="rack6-storage-1"

for ((i=0;i<10;i++))
do
    # Remove any previous incarnation of this OSD from the cluster.
    sudo ceph osd rm ${osd_num[i]}
    sudo ceph osd crush rm osd.${osd_num[i]}
    sudo ceph auth del osd.${osd_num[i]}
    sudo umount -f /var/lib/ceph/osd/ceph-${osd_num[i]}
    ceph osd create

    # Recreate the OSD data directory on a fresh XFS filesystem.
    sudo rm -rf /var/lib/ceph/osd/ceph-${osd_num[i]}
    sudo mkdir -p /var/lib/ceph/osd/ceph-${osd_num[i]}
    sudo mkfs.xfs -f -i size=2048 /dev/${drives[i]}
    sudo mount -o rw,noatime,inode64,logbsize=256k,delaylog /dev/${drives[i]} /var/lib/ceph/osd/ceph-${osd_num[i]}

    # Add the OSD to the CRUSH map, initialise it, register its key and start it.
    sudo ceph osd crush add osd.${osd_num[i]} 1 root=default host=$node
    sudo ceph-osd --id ${osd_num[i]} -d --mkkey --mkfs --osd-data /var/lib/ceph/osd/ceph-${osd_num[i]}
    ceph auth add osd.${osd_num[i]} osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-${osd_num[i]}/keyring
    sudo ceph-osd -i ${osd_num[i]}
done
###################################

5. Some configs that might be relevant are as follows:

#########
enable_experimental_unrecoverable_data_corrupting_features = keyvaluestore
osd_objectstore = keyvaluestore
keyvaluestore_backend = rocksdb
keyvaluestore queue max ops = 500
keyvaluestore queue max bytes = 100
keyvaluestore header cache size = 2048
keyvaluestore op threads = 10
keyvaluestore_max_expected_write_size = 4096000
leveldb_write_buffer_size = 33554432
leveldb_cache_size = 536870912
leveldb_bloom_size = 0
leveldb_max_open_files = 10240
leveldb_compression = false
leveldb_paranoid = false
leveldb_log = /dev/null
leveldb_compact_on_mount = false
rocksdb_write_buffer_size = 33554432
rocksdb_cache_size = 536870912
rocksdb_bloom_size = 0
rocksdb_max_open_files = 10240
rocksdb_compression = false
rocksdb_paranoid = false
rocksdb_log = /dev/null
rocksdb_compact_on_mount = false
#########

6. Objects are stored in *.sst files, so rocksdb seems to be configured correctly:

ls -l /var/lib/ceph/osd/ceph-20/current/ | more
total 3169352
-rw-r--r-- 1 root root 2128430 Jan 20 00:04 000031.sst
-rw-r--r-- 1 root root 2128430 Jan 20 00:04 000033.sst
-rw-r--r-- 1 root root 2128431 Jan 20 00:04 000035.sst
............

7. This is the current state of the cluster:

################
monmap e1: 3 mons at {rack6-ramp-1=10.x.x.x:6789/0,rack6-ramp-2=10.x.x.x:6789/0,rack6-ramp-3=10.x.x.x:6789/0}
    election epoch 16, quorum 0,1,2 rack6-ramp-1,rack6-ramp-2,rack6-ramp-3
osdmap e547: 30 osds: 8 up, 8 in
pgmap v1059: 512 pgs, 1 pools, 18252 MB data, 4563 objects
    22856 MB used, 2912 GB / 2934 GB avail
    1587/13689 objects degraded (11.593%)
    419/13689 objects misplaced (3.061%)
    26/4563 unfound (0.570%)
#################
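For reference, the write workload is generated with rados bench along these lines; the pool name, run length and thread count below are representative examples rather than necessarily the exact values I used:

#########
# Representative rados bench invocation (pool name and parameters are examples only).
rados -p testpool bench 60 write -t 16
#########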
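One way I can double-check that the objectstore settings above actually took effect on a running OSD is via the admin socket; the socket path and OSD id here are just examples assuming the default locations:

#########
# Show the objectstore-related settings a running OSD has actually loaded
# (socket path and OSD id are examples).
sudo ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep -E 'osd_objectstore|keyvaluestore'
#########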
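When the OSDs drop out, the places I know to look are the cluster's own view of the OSDs and the per-OSD logs on the affected node; the checks below assume the default /var/log/ceph log locations:

#########
# Which OSDs does the cluster currently consider down?
ceph osd tree | grep down
ceph health detail

# On an affected OSD node: look for asserts, suicide timeouts or OOM kills
# around the time the daemon disappeared.
grep -iE 'assert|suicide|abort' /var/log/ceph/ceph-osd.*.log | tail -n 50
dmesg | grep -i 'killed process'
#########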
I would be happy to provide any other information that is needed.

--
-Pushpesh