On 2019-07-15T19:08:26, "Drobyshevskiy, Vladimir" <vlad@xxxxxxxxxx> wrote:

Hi Vladimir,

is this a replicated or EC pool?

> The cluster itself:
> nautilus
> 6 nodes, 7 SSD with 2 OSDs per SSD (14 OSDs in overall).

You mean 14 OSDs per node, right?

> Each node: 2x Intel Xeon E5-2665 v1 (governor = performance, powersaving
> disabled), 64GB RAM, Samsung SM863 1.92TB SSD, QDR Infiniband.

I assume that's the cluster backend. How are the clients connected?

> I've tried to make an RAID0 with mdraid and 2 virtual drives but haven't
> noticed any difference.

Your problem isn't bandwidth - it's the commit latency for the small IO.
In your environment, that's primarily going to be governed by network
(and possibly ceph-osd CPU) latency. That doesn't show up as high
utilization anywhere, because it's mainly waiting.

Most networking is terrifyingly slow compared to the latency of a local
flash storage device. And with Ceph, you've got to add at least two
round trips to every IO (client - primary OSD, primary OSD - replicas,
probably more, and if you use EC with ec_overwrites, definitely more
round trips).

Regards,
Lars

--
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli Zbinden)
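P.S.: A rough back-of-the-envelope sketch of the latency budget, in Python.
Every number in it is an assumption for illustration only (your actual RTT
and ceph-osd times will differ), but it shows why a fast SSD alone doesn't
help queue-depth-1 writes:

    # Latency budget for one small sync write on a replicated pool.
    # All numbers below are assumed for illustration, not measured values.
    net_rtt_ms = 0.15       # assumed client<->OSD network round trip
    osd_cpu_ms = 0.30       # assumed ceph-osd processing time per hop
    flash_commit_ms = 0.05  # assumed SSD sync write commit latency

    # Client -> primary OSD, then primary -> replicas: at least two
    # network round trips sit on the critical path of every write.
    per_io_ms = 2 * net_rtt_ms + 2 * osd_cpu_ms + flash_commit_ms

    max_qd1_iops = 1000.0 / per_io_ms
    print(f"~{per_io_ms:.2f} ms per IO -> ~{max_qd1_iops:.0f} IOPS at queue depth 1")

With numbers like those you end up around 1000 IOPS for a single-threaded
sync writer no matter how fast the SSD is - the flash commit is only a tiny
fraction of the total, which is why nothing shows up as busy.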