On Thu, 4 Sep 2014, Ilya Dryomov wrote:
> On Thu, Sep 4, 2014 at 3:39 PM, Chaitanya Huilgol
> <Chaitanya.Huilgol@xxxxxxxxxxx> wrote:
> > Hi,
> >
> > In our benchmarking tests we observed that the ms_tcp_nodelay option
> > in ceph.conf does not affect the kernel rbd client, and as expected
> > we see poor latency numbers for lower queue depths with 4K random
> > reads. There is a significant increase in latency from qd=2 to 24,
> > and it starts tapering down for higher queue depths.
> >
> > We did not find a relevant kernel_setsockopt with TCP_NODELAY in the
> > kernel RBD/libceph (messenger.c) source. Unless we are missing
> > something, it looks like the kernel RBD currently does not set this,
> > and this is affecting latency numbers at lower queue depths.
> >
> > I have tested with userspace fio (rbd engine) and rados bench, and we
> > see similar latency behavior when ms_tcp_nodelay is set to false.
> > However, setting this to true gives consistently low latency numbers
> > for all queue depths.
> >
> > Any ideas/thoughts on this?
> >
> > OS: Ubuntu 14.04
> > Kernel: 3.13.0-24-generic #46-Ubuntu SMP
> > Ceph: Latest Master
>
> No, we don't set TCP_NODELAY in the kernel client, but I think we can
> add it as a rbd map/mount option.  Sage?

We definitely can, and I think more importantly it should be on by
default, as it is in userspace.  I'm surprised we missed that.  :(

IIRC we are carefully setting the MORE (or CORK?) flag on all but the
last write for a message, but I take it there is a socket-level option
we missed?

sage
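
For reference, the socket-level option in question is separate from any
per-write batching flag: TCP_NODELAY is set once on the socket and
disables Nagle's algorithm for everything sent on it. Below is a minimal
userspace sketch in Python, only to illustrate the option itself. It is
not the libceph change, and the helper names make_low_latency_socket and
nodelay_enabled are made up for this example.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY is a per-socket option at the IPPROTO_TCP level. It is
    # independent of any per-send flags used to batch partial writes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_low_latency_socket()
    print("TCP_NODELAY enabled:", nodelay_enabled(s))
    s.close()
```

Reading the option back with getsockopt is a quick way to confirm that a
given socket actually has it set, rather than inferring it from latency
numbers.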