Ok, weird problem(s), if you want to call it that. I run a 10-OSD Ceph cluster on 4 hosts with SSDs (Intel DC3700) as journals. I have a lot of mixed workloads running, the Linux guests seem to get corrupted in a weird way, and the performance kind of sucks.

First off: all hosts are running OpenStack with KVM + libvirt to connect and boot the RBD volumes.

ceph -v: ceph version 0.94.6

——————
Problem 1: Corruption
——————

Whenever I run fsck.ext4 -nvf /dev/vda1 on one of the guests I get this:

e2fsck 1.42.9 (4-Feb-2014)
Warning!  /dev/vda1 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 1647 has zero dtime.  Fix? no
Inodes that were part of a corrupted orphan linked list found.  Fix? no
Inode 133469 was part of the orphaned inode list.  IGNORED.
Inode 133485 was part of the orphaned inode list.  IGNORED.
Inode 133490 was part of the orphaned inode list.  IGNORED.
Inode 133492 was part of the orphaned inode list.  IGNORED.
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (8866035, counted=8865735).  Fix? no
Inode bitmap differences:  -1647 -133469 -133485 -133490 -133492  Fix? no
Free inodes count wrong (2508840, counted=2509091).  Fix? no

cloudimg-rootfs: ********** WARNING: Filesystem still has errors **********

  112600 inodes used (4.30%, out of 2621440)
      70 non-contiguous files (0.1%)
      77 non-contiguous directories (0.1%)
         # of inodes with ind/dind/tind blocks: 0/0/0
         Extent depth histogram: 104372/41
 1619469 blocks used (15.44%, out of 10485504)
       0 bad blocks
       2 large files
   89034 regular files
   14945 directories
      55 character device files
      25 block device files
       1 fifo
      16 links
    8265 symbolic links (7832 fast symbolic links)
      10 sockets
------------
  112351 files

But when I map the same image directly on a host with rbd map and fsck the mapped device, I get:

fsck.ext4 -nfv /dev/rbd0p1
e2fsck 1.42.11 (09-Jul-2014)
cloudimg-rootfs: clean, 112600/2621440 files, 1619469/10485504 blocks

So which one do I trust? I have had corrupted files on some of the images, but I attributed that to the migration from qcow2 to raw -> Ceph. Any help is really appreciated. (A snapshot-based re-check I am considering is sketched at the end of this mail.)

————
Problem 2: Performance
————

I would assume that even with the Intel DC SSDs as journals I would get decent performance out of this system, but currently I max out at about 200 MB/s write, while reads fill the full 10 Gbit/s. There are 10 SATA drives behind the SSDs: two SSDs journal 3 SATA drives each, and two SSDs journal 2 SATA drives each.

fio is also giving terrible results: it cranks the IO up to about 5000, then dwindles down. It looks almost like it is waiting to flush the SSDs out, or the IO.

The only changes I made to the base config are rbd cache = true and the following:

ceph tell osd.* injectargs '--filestore_wbthrottle_enable=false'
ceph tell osd.* injectargs '--filestore_queue_max_bytes=1048576000'
ceph tell osd.* injectargs '--filestore_queue_committing_max_ops=5000'
ceph tell osd.* injectargs '--filestore_queue_committing_max_bytes=1048576000'
ceph tell osd.* injectargs '--filestore_queue_max_ops=200'
ceph tell osd.* injectargs '--journal_max_write_entries=1000'
ceph tell osd.* injectargs '--journal_queue_max_ops=3000'

That is the only way I reached 200-250 MB/s; otherwise it is more like 115 MB/s, again waiting for a flush after each wave. Can anyone give me a fairly decent idea of how to tune this properly?
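In case the exact settings matter: the injectargs above only apply at runtime, so this is the ceph.conf equivalent I would use to keep the same values across OSD restarts (just a sketch with the same numbers as above; rbd cache sits under [client] on the hypervisors):

[client]
    rbd cache = true

[osd]
    filestore wbthrottle enable = false
    filestore queue max ops = 200
    filestore queue max bytes = 1048576000
    filestore queue committing max ops = 5000
    filestore queue committing max bytes = 1048576000
    journal max write entries = 1000
    journal queue max ops = 3000

The OSDs need a restart for the file to take effect; injectargs is just what I used while experimenting.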
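And for Problem 1: since fsck.ext4 -n against a mounted, running root filesystem can report things like orphaned inodes and wrong free counts simply because the filesystem is in use, I am considering re-checking against an RBD snapshot instead of the live device, roughly like this (pool/image names are placeholders for my actual OpenStack volumes, and the /dev/rbdX number depends on what the kernel assigns):

# ideally with the guest frozen (fsfreeze) or shut down, so the snapshot is consistent
rbd snap create volumes/volume-XXXX@fsck-check
rbd map --read-only volumes/volume-XXXX@fsck-check    # shows up as e.g. /dev/rbd1
fsck.ext4 -nfv /dev/rbd1p1
rbd unmap /dev/rbd1
rbd snap rm volumes/volume-XXXX@fsck-check

Does that sound like a sane way to verify whether the images are actually corrupted?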
Also, could this modification have something to do with the corruption?

Thanks again for any help :)

//Florian