Sorry about the repost from the cbt list, but it was suggested I post here as well: I am attempting to track down some performance issues in a recently deployed Ceph cluster. Our configuration is as follows:

3 storage nodes, each with:
  - 8 cores
  - 64GB of RAM
  - 2x 1TB 7200 RPM spindles
  - 1x 120GB Intel SSD
  - 2x 10Gbit NICs (in an LACP port-channel)

The OSD pool "min_size" is set to "1" and "size" is set to "3".

When creating a new pool and running RADOS benchmarks (rough invocation in the P.S. below), performance isn't bad and is about what I would expect from this hardware configuration:

WRITES:
  Total writes made:       207
  Write size:              4194304
  Bandwidth (MB/sec):      80.017
  Stddev Bandwidth:        34.9212
  Max bandwidth (MB/sec):  120
  Min bandwidth (MB/sec):  0
  Average Latency:         0.797667
  Stddev Latency:          0.313188
  Max latency:             1.72237
  Min latency:             0.253286

RAND READS:
  Total time run:          10.127990
  Total reads made:        1263
  Read size:               4194304
  Bandwidth (MB/sec):      498.816
  Average Latency:         0.127821
  Max latency:             0.464181
  Min latency:             0.0220425

This all looks fine until we try to use the cluster for its intended purpose, which is to house images for qemu-kvm, accessed using librbd. I/O inside the VMs sees excessive wait times (at times in the hundreds of milliseconds, which makes some operating systems, such as Windows, unusable), and throughput struggles to exceed 10MB/s. Looking at the ceph status output, we see very low op/s and throughput numbers, and the blocked-request count seems very high. Any ideas as to what to look at here?

     health HEALTH_WARN
            8 requests are blocked > 32 sec
     monmap e3: 3 mons at {storage-1=10.0.0.1:6789/0,storage-2=10.0.0.2:6789/0,storage-3=10.0.0.3:6789/0}
            election epoch 128, quorum 0,1,2 storage-1,storage-2,storage-3
     osdmap e69615: 6 osds: 6 up, 6 in
      pgmap v3148541: 224 pgs, 1 pools, 819 GB data, 227 kobjects
            2726 GB used, 2844 GB / 5571 GB avail
                 224 active+clean
  client io 3957 B/s rd, 3494 kB/s wr, 30 op/s

Of note, on the other list I was asked to provide the following:
  - ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
  - The SSD is split into 8GB partitions, which are used as journal devices and are specified per OSD in /etc/ceph/ceph.conf. For example:

      [osd.0]
      host = storage-1
      osd journal = /dev/mapper/INTEL_SSDSC2BB120G4_CVWL4363006R120LGNp1

  - rbd_cache is enabled and the qemu cache mode is set to "writeback" (the drive configuration is sketched in the P.P.S. below)
  - rbd_concurrent_management_ops is unset, so it appears to be at the default of "10"

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
Service-Disabled Veteran-Owned Business
1775 Wiehle Avenue Suite 101 | Reston, VA 20190
c: 228-547-8045 f: 571-266-3106
DHS EAGLE II Prime Contractor: FC1 SDVOSB Track
GSA Schedule 70 SDVOSB: GS-35F-0646S
GSA MOBIS Schedule: GS-10F-0404Y
ISO 20000 / ISO 27001
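P.S. For reference, this is roughly how the RADOS benchmarks above were run. The pool name is a placeholder and the exact flags may have differed slightly, but it was a plain 4MB-object write test followed by a random read test against the same objects:

    # 10-second write test with 4MB objects; keep the objects so the read test has data
    rados bench -p bench-test 10 write --no-cleanup

    # 10-second random read test against the objects written above
    rados bench -p bench-test 10 rand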
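P.P.S. The guests attach their images via librbd with the qemu cache mode set as mentioned above. The pool, image, and client names here are placeholders, and in practice the drive is defined through our management layer rather than typed by hand, but it ends up looking something like this:

    # illustrative only: raw RBD image attached with cache=writeback
    qemu-system-x86_64 ... \
      -drive format=raw,if=virtio,cache=writeback,file=rbd:rbd/vm-disk-1:id=admin:conf=/etc/ceph/ceph.conf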