I would say stick with Hammer. Also, I hope you are making sure you have a sufficiently large number of PGs on the pool you are testing with.

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Robert LeBlanc
Sent: Saturday, June 06, 2015 11:06 PM
To: Somnath Roy
Cc: Dałek, Piotr; ceph-devel
Subject: Re: Looking to improve small I/O performance

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

This is the test that we are running; it simulates the workload size and ratios of our typical servers. Of course, we are not doing direct I/O on our real workloads, and the thread and job counts are arbitrary, but it gives us an idea of the performance we could expect.

fio --name fio_test --rw randrw --bssplit 4k/85:32k/11:512/3:1m/1,4k/89:32k/10:512k/1 --ioengine libaio --iodepth 8 --numjobs 8 --direct 1 --rwmixread 72 --norandommap --minimal --size=2500MB --runtime=300 --time_based --thread

Right now we are on spindles with SSD journals. We have not configured SSD tiering, but I'm not sure it will help much given the low ratio of SSD to spindle. I'll do some tests on our dev cluster and see what I come up with. Would doing this test be pretty pointless with 0.94.1? Should I be looking at master because there are significant improvements?
Thanks,
-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJVc99ZCRDmVDuy+mK58QAALJ8P/jDohnzb6PB+OxHSjwqx
MPBuhdgkEzFa3s0Kk7PGR/ZKQUg+KR6nAWy8Ja8Ee0ToMN9vOjZOXtKcdI3a
EEZh3BdmyArjOvIdZiUP9Dkl9wQhkJgDnA/Ssg1s+3KkamFh7MfS+rzj0KiX
ivg+0SZos3Y3J1Lt1MFbbCRAbjMquHhDOe1YgPg+uETJlOkRjtgxTMN8FCWI
VZYWn3uzSeMOoGGLMfE6H2p4WMMTPQdTEYz1ofi2hYC9GbAnUdk8hHUj5Np7
kqA9nJUZ+TIMS3sFg62Vy01iBlk//TUJIexanprzRsjQT03sju/QIcsAqksy
Cvtc+VF8HJgr9eWryyVS1k5tpmP6bk+l3v1zIjBugTgxFj0p3frizhVXmz2s
O00bsgenrdrzu/W7lK43V2Jhrt8UM2qVfZfJKYfpo+VGLvA/FSxYfpPJKbd3
gS2oE4ldRDjg7k0+GafO4bONBo9MmSmwRzjiBl/RScqn1GzX2mqaaokr12+E
H9ZJ4B8u3UkI+VyLdRCOYxTuPtnByvOlkIQRQvG/p2Au8fQPuzV0iB2d1Zu8
BEcI2LpDxzzsXHI6HXnMqEkXtK/XoEhYDc4RL2cdByQocsqCUZU4RKWKoUEJ
wOKDGCh66g5BynnvoRo6dx/NcN5z10A1YkIP0v0rBgK+X80dZ/yZMr3jB5o6
7Ghj
=p3Iq
-----END PGP SIGNATURE-----
----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Sat, Jun 6, 2015 at 10:44 AM, Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote:
> Robert,
> You can try the following config options to enable the async messenger.
>
> ms_type = async
> enable_experimental_unrecoverable_data_corrupting_features = ms-type-async
>
> BTW, what kind of workload are you trying, random read or write?
> Also, is this an SSD or HDD cluster?
>
> Thanks & Regards
> Somnath
> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx
> [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Robert LeBlanc
> Sent: Saturday, June 06, 2015 8:06 AM
> To: Dałek, Piotr
> Cc: ceph-devel
> Subject: Re: Looking to improve small I/O performance
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> I found similar results in my testing as well. Ceph is certainly great at large I/O, but our workloads are in the small I/O range. I understand that latency plays a huge role when I/O is small.
> Since we have 10 and 40 Gb Ethernet, we can't get much lower in latency that way (InfiniBand is not an option right now). So I was poking around to see if there were some code optimizations that might reduce latency (not that I'm smart enough to do the coding myself).
>
> I was surprised that when I enabled the QEMU writeback cache and set the cache to the working set size, I really didn't get any additional performance. After the first run, the QEMU process allocated almost all the memory. I believe after several runs there was some improvement, but not what I expected.
>
> What is the status of the async messenger? It looks like it is experimental in the code. How do you enable it? I can't seem to find a config option; does Ceph have to be compiled with it? I would like to test it on my dev cluster.
>
> Thanks,
> -----BEGIN PGP SIGNATURE-----
> Version: Mailvelope v0.13.1
> Comment: https://www.mailvelope.com
>
> wsFcBAEBCAAQBQJVcwxfCRDmVDuy+mK58QAAIJ0P/iz05KKuNw1Ypk3xsg/v
> 7MzrSw70+RZMJd4qOs8OFrBC+IiX1KBOlgrrtAjKRygWgYgK3Aqzw5DEu1RN
> 2tJiGai9e5Vch/wl+OHhP7S07Q2eN7fJJS+OFtA481XBNeFGhdywhOYenJjk
> RcDSJcVPgcrPB5SI90UqycwxLjH+XBotFHycuwyHj4LqkHXf4tM4Nbi4A1RV
> xOhVPQxWlaregwOaS8b8kwFUzkLQic1mMNgSMizpSiPnLuXUnfI7pjtvjOYU
> ld6QmZgu+xKC/qIJm8ToOJUVD3IkSbpv8Ngs73K12h/3C8mj4+uY4qJWouG4
> RU3sFMfKgVeNDPSIsjO7Zy9s5/lp64RqPcblj72+3yYC+YJ4ZhLAwRyhtSvO
> VXkLheZRtMemWbrOCQKinWAlH+m0dwAHv816oFFvkFdOYl/xmmiTo9ctNBqC
> MVK9tm01DRqA23MFFNQ25lvHzFv3zZ7aPWLeqRin8F7dddwBauva/J7GyFC0
> bk0mPi83++LQt3r+PUMYCOS+aG+0f8oM8/uValUfEGr4+pcjyI/dZk1k0Q6c
> cImb2cmy16OgrfzN7isYt7z37dUlQT/2rC74LvTscIIdf1dZQHWwXHelRm49
> y1pxq07V7LlL6gM+zA6Zskm9QwlJ3D81mH7QpiaixKX8cEzcVifD7WUzP/YV
> Go8K
> =gB4o
> -----END PGP SIGNATURE-----
> ----------------
> Robert LeBlanc
> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Sat, Jun 6, 2015 at 12:07 AM, Dałek, Piotr <Piotr.Dalek@xxxxxxxxxxxxxx> wrote:
>>> -----Original Message-----
>>> From: ceph-devel-owner@xxxxxxxxxxxxxxx
>>> [mailto:ceph-devel-
>>>
>>> I'm digging into perf and the code to see where/how I might be able
>>> to improve performance for small I/O around 16K.
>>>
>>> I ran fio with rados and used perf to record. Looking through the
>>> report, there is a very substantial amount of time spent creating threads
>>> (or so it looks, but I'm really new to perf). It seems to point to
>>> the messenger, so I looked in the code. From perf it looks like thread
>>> pooling isn't happening, but from what I can gather from the code, it should.
>>> [..]
>>
>> This is because you use SimpleMessenger, which can't handle small I/O well.
>> Indeed, threads are problematic with it, as well as memory allocation.
>> I did some benchmarking some time ago and the gist of it is that you
>> could try going for AsyncMessenger and see if it helps. You can also see my results here:
>> http://stuff.predictor.org.pl/chunksize.xlsx
>> From there you can see that most of the time for small I/Os in
>> SimpleMessenger is spent in tcmalloc code, and also that there's a
>> performance drop around the 64k block size in AsyncMessenger.
>>
>> With best regards / Pozdrawiam
>> Piotr Dałek
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo
> info at http://vger.kernel.org/majordomo-info.html
>
> ________________________________
>
> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited.
> If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
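
For reference, the average request size implied by the fio --bssplit option quoted in this thread can be worked out with a few lines of Python. This is only a sketch: it assumes fio's documented bssplit syntax (size/percentage pairs separated by colons, with the read list and the write list separated by a comma) and fio's default 1024-based size suffixes. The helper names parse_size, parse_bssplit and mean_block_size are hypothetical and not part of fio or Ceph. The read entry "512/3" is taken as written, that is, 512 bytes.

```python
# Sketch: estimate the mean request size implied by a fio --bssplit spec.
# Helper names are illustrative only; they are not part of fio or Ceph.

def parse_size(text):
    """Convert a size such as '4k', '1m' or '512' to bytes (1024-based)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower()
    if text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)  # no suffix: plain bytes


def parse_bssplit(spec):
    """Return [(size_in_bytes, percent), ...] for one direction of a bssplit."""
    pairs = []
    for entry in spec.split(":"):
        size, percent = entry.split("/")
        pairs.append((parse_size(size), float(percent)))
    return pairs


def mean_block_size(pairs):
    """Percentage-weighted mean request size in bytes."""
    total_percent = sum(percent for _, percent in pairs)
    return sum(size * percent for size, percent in pairs) / total_percent


# The spec from the fio command line in this thread: reads, then writes.
read_spec, write_spec = "4k/85:32k/11:512/3:1m/1,4k/89:32k/10:512k/1".split(",")

read_mean = mean_block_size(parse_bssplit(read_spec))
write_mean = mean_block_size(parse_bssplit(write_spec))

print("mean read size:  %.1f KiB" % (read_mean / 1024))
print("mean write size: %.1f KiB" % (write_mean / 1024))
```

Under those assumptions the mean read is about 17.2 KiB and the mean write about 11.9 KiB, which is in line with the "small I/O around 16K" described in the original post, even though 4k requests make up 85 to 89 percent of the operations.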