Hello,

On Sun, 5 Feb 2017 02:03:57 +0100 Marc Roos wrote:

> I have a 3 node test cluster with one osd per node. And I write a file
> to a pool with size 1. Why doesn't ceph just use the full 110MB/s of
> the network (as with default rados bench test)? Does ceph 'reserve'
> bandwidth for other concurrent connections? Can this be tuned?
>

Firstly, benchmarking from within the cluster will always give you
skewed results, typically better than what a client would see, since
some actions will be local.

Secondly, you're not telling us much about the cluster, but I'll assume
these are plain drives with in-line journals. Same about your rados
bench test run, so I'll assume defaults there.

What you're seeing here is almost certainly the difference between a
single client (your "put") and multiple ones (rados bench uses 16
threads by default). So rados bench gets to distribute all its writes
amongst all OSDs and is able to saturate things, while your put has to
wait for a single OSD and the latency of the network.

A single SATA HDD can typically do about 150MB/s of writes, but that
would be sequential ones, which RBD isn't. The journal takes half of
that, the FS journals and other overhead take some more, so about
40MB/s of effective performance doesn't surprise me at all.
Run this test with atop or iostat active on all machines to confirm.

As for your LACP question, it is what it is. With a sufficient number
of real clients and OSDs things will get distributed eventually, but a
single client will never get more than one link's worth.
Typically people find that while bandwidth certainly is desirable (up
to a point), it is the lower latency of faster links like 10/25/40Gb/s
Ethernet or Infiniband that makes their clients (VMs) happier.
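
If you want to see the concurrency effect in isolation, something along
these lines should do it (a rough sketch, using your data1 pool and an
arbitrary 30 second run time):

  # one outstanding write at a time, roughly what a lone "rados put" sees
  rados bench -p data1 30 write -t 1 --no-cleanup

  # the default of 16 concurrent 4MB writes, i.e. what your earlier bench did
  rados bench -p data1 30 write -t 16 --no-cleanup

  # remove the benchmark objects afterwards
  rados -p data1 cleanup

The -t 1 run should land in the same ballpark as your put, while the
-t 16 run should get close to what you saw with the default bench.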
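
For the disk side of things, watching the OSD nodes while the put is
running is the quickest confirmation, for example with iostat (from the
sysstat package) or atop at one second intervals:

  # extended per-device statistics in MB, refreshed every second
  iostat -xm 1

  # or a combined CPU/disk/network view
  atop 1

If the journals really are in-line you should see roughly twice the
client throughput hitting the data disk.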

Christian

> Putting from ram drive on first node
> time rados -p data1 put test3.img test3.img
>
> --net/eth0----net/eth1- --dsk/sda-----dsk/sdb-----dsk/sdc--
>  recv  send: recv  send| read  writ: read  writ: read  writ
> 2440B 4381B:1973B    0 |   0     0 :   0     0 :   0     0
> 1036B 3055B:1958B  124B|   0     0 :   0     0 :   0     0
> 1382B 3277B:1316B    0 |   0     0 :   0     0 :   0     0
> 1227B 2850B:1243B    0 |   0     0 :   0     0 :   0     0
> 1216B  120k:2300B    0 |   0     0 :   0     0 :   0     0
> 1714B 8257k:  15k    0 |   0  4096B:   0     0 :   0     0
> 1006B   24M:  40k    0 |   0    14k:   0     0 :   0     0
> 1589B   36M:  58k    0 |   0    32k:   0     0 :   0     0
>  856B   36M:  57k    0 |   0     0 :   0     0 :   0     0
>  856B   40M:  64k    0 |   0     0 :   0     0 :   0     0
> 2031B   36M:  58k    0 |   0     0 :   0     0 :   0     0
>  865B   36M:  58k    0 |   0    24k:   0     0 :   0     0
> 1713B   39M:  61k    0 |   0    37k:   0     0 :   0     0
>  997B   38M:  59k    0 |   0     0 :   0     0 :   0     0
>   66B   36M:  58k    0 |   0     0 :   0     0 :   0     0
> 1782B   36M:  57k    0 |   0     0 :   0     0 :   0     0
>  931B   36M:  58k    0 |   0  8192B:   0     0 :   0     0
>  931B   36M:  57k    0 |   0    45k:   0     0 :   0     0
>  724B   36M:  57k    0 |   0     0 :   0     0 :   0     0
>  922B   28M:  47k    0 |   0     0 :   0     0 :   0     0
> 2506B 4117B:2261B    0 |   0     0 :   0     0 :   0     0
>  865B 7630B:2631B    0 |   0    15k:   0     0 :   0     0
>
>
> Goes to 3rd node
>
> --net/eth0----net/eth1- --dsk/sda-----dsk/sdb-----dsk/sdc-----dsk/sdd--
>  recv  send: recv  send| read  writ: read  writ: read  writ: read  writ
>   66B 1568B: 733B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 2723B 4979B:1469B    0 |   0     0 :   0     0 :   0     0 :   0     0
>   66B 1761B:1347B    0 |   0     0 :   0     0 :   0     0 :   0     0
>   66B 2480B: 119k    0 |   0     0 :   0     0 :   0     0 :   0     0
>  103k   22k:  12M    0 |   0  4096B:   0     0 :   0    12M:   0     0
>  784B   41k:  24M    0 |   0    17k:   0     0 :   0    24M:   0     0
> 1266B   63k:  38M    0 |   0    40k:   0     0 :   0    37M:   0     0
>   66B   60k:  39M    0 |   0     0 :   0     0 :   0    41M:   0     0
>   66B   60k:  37M    0 |   0     0 :   0     0 :   0    38M:   0     0
>  104k   62k:  38M    0 |   0     0 :   0     0 :   0    38M:   0     0
>   66B   61k:  38M    0 |   0    15k:   0     0 :   0    42M:   0     0
> 1209B   59k:  36M    0 |   0    44k:   0     0 :   0    39M:   0     0
>   66B   60k:  38M    0 |   0    87k:   0     0 :   0    39M:   0     0
>  980B   62k:  38M    0 |   0     0 :   0     0 :   0    41M:   0     0
>  103k   52k:  32M    0 |   0  4096B:   0     0 :  60k   42M:   0     0
>   66B   61k:  38M    0 |   0  8192B:   0     0 :   0    40M:   0     0
>  476B   58k:  36M    0 |   0    45k:   0     0 :   0    41M:   0     0
> 1514B   55k:  34M    0 |   0     0 :   0     0 :8192B   41M:   0     0
>  856B   42k:  24M    0 |   0     0 :   0     0 :   0    28M:   0     0
>  103k 3010B:1681B    0 |   0     0 :   0     0 :   0     0 :   0     0
>  126B 3363B:4187B    0 |   0    15k:   0     0 :   0     0 :   0     0
>  551B 2714B:  22k    0 |   0    40k:   0     0 :   0     0 :   0     0
>  724B 2073B: 919B    0 |   0     0 :   0     0 :   0     0 :   0     0
>   66B 2044B: 118k    0 |   0     0 :   0     0 :   0     0 :   0     0
>
> 2nd node does nothing, as expected
>
> --net/eth0----net/eth1- --dsk/sda-----dsk/sdb-----dsk/sdc-----dsk/sdd--
>  recv  send: recv  send| read  writ: read  writ: read  writ: read  writ
>  348B  104k:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 4326B 3132B: 294B    0 |   0    14k:   0     0 :   0     0 :   0     0
>   24k 3783B:  66B    0 |   0    40k:   0     0 :   0     0 :   0     0
>  658B 1258B:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
>  658B 1044B: 190B    0 |   0     0 :   0     0 :   0     0 :   0     0
>  926B  105k:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 4891B 3197B:  66B    0 |   0    27k:   0     0 :   0     0 :   0     0
>   22k 1752B:  66B    0 |   0    44k:   0     0 :   0     0 :   0     0
> 1655B 1925B:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 1440B 1966B:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
>  710B  104k:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 4186B 2539B:  66B    0 |   0    16k:   0     0 :   0     0 :   0     0
>   22k 1677B:  66B    0 |   0    45k:   0     0 :   0     0 :   0     0
>  658B 1258B:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 2896B 3018B:  66B  124B|   0     0 :   0     0 :   0     0 :   0     0
>  710B  104k:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 4973B 3251B:  66B    0 |   0    23k:   0     0 :   0     0 :   0     0
>   22k 1752B:  66B    0 |   0    41k:   0     0 :   0     0 :   0     0
>  733B 1267B:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 1580B 1900B:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0
> 1368B  105k:  66B    0 |   0     0 :   0     0 :   0     0 :   0     0

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com