Hello,

On Mon, 11 Jul 2016 07:35:02 +0300 K K wrote:

>
> Hello, guys
>
> I to face a task poor performance into windows 2k12r2 instance running
> on rbd (openstack cluster). RBD disk have a size 17Tb. My ceph cluster
> consist from:
> - 3 monitors nodes (Celeron G530/6Gb RAM, DualCore E6500/2Gb RAM,
> Core2Duo E7500/2Gb RAM). Each node have 1Gbit network to frontend subnet
> od Ceph cluster

I hope the fastest of these MONs (CPU and storage) has the lowest IP
number and thus is the leader.

Also, what Ceph, OS and kernel versions?

> - 2 block nodes (Xeon E5620/32Gb RAM/2*1Gbit NIC). Each node have
> 2*500Gb HDD for operation system and 9*3Tb SATA HDD (WD SE). Total 18
> OSD daemons on 2 nodes.

Two GbE ports; given the "frontend" up there in the MON description, I
assume that's 1 port each for the client (front) and cluster (back)
network?

>Journals placed on same HDD as a rados data. I
> know that better using for those purpose separate SSD disk.

Indeed...

>When I test
> new windows instance performance was good (read/write something about
> 100Mb/sec). But after I copied 16Tb data to windows instance read
> performance has down to 10Mb/sec. Type of data on VM - image and video.
>

100MB/s would be absolutely perfect with the setup you have, assuming no
contention (other clients).

Is there any other client than that Windows VM on your Ceph cluster?

> ceph.conf on client side:
> [global]
> auth cluster required = cephx
> auth service required = cephx
> auth client required = cephx
> filestore xattr use omap = true
> filestore max sync interval = 10
> filestore queue max ops = 3000
> filestore queue commiting max bytes = 1048576000
> filestore queue commiting max ops = 5000
> filestore queue max bytes = 1048576000
> filestore queue committing max ops = 4096
> filestore queue committing max bytes = 16 MiB
                                          ^^^
Is Ceph understanding this now?
Other than that, the queue options aren't likely to do much good with pure
HDD OSDs.
> filestore op threads = 20
> filestore flusher = false
> filestore journal parallel = false
> filestore journal writeahead = true
> journal dio = true
> journal aio = true
> journal force aio = true
> journal block align = true
> journal max write bytes = 1048576000
> journal_discard = true
> osd pool default size = 2 # Write an object n times.
> osd pool default min size = 1
> osd pool default pg num = 333
> osd pool default pgp num = 333

That should be 512, 1024 really with one RBD pool.
http://ceph.com/pgcalc/

> osd crush chooseleaf type = 1
>
> [client]
> rbd cache = true
> rbd cache size = 67108864
> rbd cache max dirty = 50331648
> rbd cache target dirty = 33554432
> rbd cache max dirty age = 2
> rbd cache writethrough until flush = true
>
>
> rados bench show from block node show:

Wrong way to test this; test it from a monitor node or another client node
(like your openstack nodes).
In your 2 node cluster half of the reads or writes will be local, very
much skewing your results.

> rados bench -p scbench 120 write --no-cleanup

The default tests with 4MB "blocks"; what are the writes or reads from
your client VM like?

> Total time run: 120.399337
> Total writes made: 3538
> Write size: 4194304
> Object size: 4194304
> Bandwidth (MB/sec): 117.542
> Stddev Bandwidth: 9.31244
> Max bandwidth (MB/sec): 148
                          ^^^
That wouldn't be possible from an external client on a 1Gbit link.

> Min bandwidth (MB/sec): 92
> Average IOPS: 29
> Stddev IOPS: 2
> Max IOPS: 37
> Min IOPS: 23
> Average Latency(s): 0.544365
> Stddev Latency(s): 0.35825
> Max latency(s): 5.42548

Very high max latency, telling us that your cluster ran out of steam at
some point.
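
To put numbers on the bandwidth remarks above, here is a minimal sketch in
Python. It recomputes the average from the quoted totals and compares the
reported maximum with the raw ceiling of one 1Gbit/s link. The single-link
assumption and all variable names are illustrative, not from the original
post; the "MB" in the rados bench output is treated as MiB because that
reproduces the reported average.

```python
# Figures copied from the quoted "rados bench ... write" output.
writes_made = 3538
object_size_bytes = 4194304
run_time_s = 120.399337
max_bandwidth_reported = 148  # "MB/sec" in the output, treated here as MiB/s

MIB = 1024 * 1024

# Average bandwidth: total bytes written divided by run time, in MiB/s.
avg_mib_s = writes_made * object_size_bytes / MIB / run_time_s

# Assumption: a single 1 Gbit/s port carries all client traffic.
# Protocol overhead is ignored, so this is an optimistic upper bound.
gbe_limit_mib_s = 1_000_000_000 / 8 / MIB

exceeds_link = max_bandwidth_reported > gbe_limit_mib_s

print(f"average bandwidth : {avg_mib_s:.3f} MiB/s")
print(f"1GbE raw ceiling  : {gbe_limit_mib_s:.1f} MiB/s")
print(f"reported max above ceiling: {exceeds_link}")
```

A reported maximum above that ceiling is consistent with part of the
traffic staying local to the node that ran the benchmark.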
> Min latency(s): 0.101533
>
> rados bench -p scbench 120 seq
> Total time run: 120.880920
> Total reads made: 1932
> Read size: 4194304
> Object size: 4194304
> Bandwidth (MB/sec): 63.9307
> Average IOPS 15
> Stddev IOPS: 3
> Max IOPS: 25
> Min IOPS: 5
> Average Latency(s): 0.999095
> Max latency(s): 8.50774
> Min latency(s): 0.0391591
>
> rados bench -p scbench 120 rand
> Total time run: 121.059005
> Total reads made: 1920
> Read size: 4194304
> Object size: 4194304
> Bandwidth (MB/sec): 63.4401
> Average IOPS: 15
> Stddev IOPS: 4
> Max IOPS: 26
> Min IOPS: 1
> Average Latency(s): 1.00785
> Max latency(s): 6.48138
> Min latency(s): 0.038925
>
> On XFS partitions fragmentation no more than 1%

I'd de-frag anyway, just to rule that out.

When doing your tests or normal (busy) operations from the client VM, run
atop on your storage nodes and observe your OSD HDDs.
Do they get busy, around 100%?

Check with iperf or NPtcp that your network to the clients from the
storage nodes is fully functional.

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
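
For reference, the PG count suggested earlier (1024 rather than 333) can
be reproduced with a rough sketch of the usual rule of thumb. This assumes
a target of about 100 PGs per OSD and simply rounds up to the next power of
two; the pgcalc page linked above applies its own rounding rules, and the
function name below is hypothetical.

```python
import math


def suggested_pg_count(osds: int, replicas: int, target_pgs_per_osd: int = 100) -> int:
    """Rule-of-thumb PG count for a cluster with a single pool.

    Assumption: roughly `target_pgs_per_osd` PGs per OSD, divided by the
    replica count and rounded up to the next power of two.
    """
    raw = math.ceil(osds * target_pgs_per_osd / replicas)
    # (raw - 1).bit_length() gives the exponent of the next power of two >= raw.
    return 1 << (raw - 1).bit_length()


# 18 OSDs with "osd pool default size = 2", as in the quoted setup.
pg_num = suggested_pg_count(osds=18, replicas=2)
print(pg_num)
```

With 18 OSDs and two replicas the raw value is 900, which rounds up to
1024.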