Okay, well, let's try and track some of these down. What's the content of the
"ceph.layout" xattr on the directory you're running this test in? Can you
verify that pool 0 is the data pool used by CephFS, and that all reported
slow ops are in that pool? Can you record the IO patterns on an OSD while
this test is being run and see what it does? (I'm wondering if none of the
CephFS pools are in the page cache due to lack of use, and it's seeking all
over trying to find them once the test starts.)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Feb 24, 2014 at 11:54 PM, Dan van der Ster
<daniel.vanderster@xxxxxxx> wrote:
> It's really bizarre, since we can easily pump ~1GB/s into the cluster with
> rados bench from a single 10Gig-E client. We only observe this with kernel
> CephFS on that host -- which is why our original theory was something like
> this:
> - the client caches 4GB of writes
> - the client then starts many IOs in parallel to flush that cache
> - each individual 4MB write takes longer than 30s to send from the client
> to the OSD, due to the 1Gig-E network interface on the client
>
> But in that theory we assume quite a lot about the implementations of
> librados and the OSD. Something like this would also explain why only the
> CephFS writes are becoming slow -- the 2 kHz of other (mostly RBD) IOs are
> not affected by this "overload".
>
> Cheers, Dan
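
For reference, the first two checks suggested above can be scripted roughly
as follows. This is only a minimal sketch, assuming a Linux client with the
CephFS kernel mount and admin access to the ceph CLI; the mount path is
hypothetical and should be replaced with the directory used in the test.

#!/usr/bin/env python3
# Sketch: read the directory's "ceph.layout" xattr and list pool ids/names
# from the OSD map, so the pool named in the slow-request messages (pool 0)
# can be matched against the CephFS data pool.
import json
import os
import subprocess

TEST_DIR = "/mnt/cephfs/testdir"  # hypothetical path; use the real test directory

# 1. The layout xattr reports stripe unit/count, object size and the data
#    pool that files created under this directory will be written to.
try:
    layout = os.getxattr(TEST_DIR, "ceph.layout")
    print("ceph.layout on %s: %s" % (TEST_DIR, layout.decode()))
except OSError as exc:
    print("could not read ceph.layout xattr: %s" % exc)

# 2. Dump the OSD map as JSON and print pool ids and names, to confirm
#    whether pool 0 really is the CephFS data pool.
osd_dump = json.loads(
    subprocess.check_output(["ceph", "osd", "dump", "--format=json"])
)
for pool in osd_dump.get("pools", []):
    print("pool %s: %s" % (pool.get("pool"), pool.get("pool_name")))

For the third question (IO patterns on an OSD), one option is to run
something like "iostat -x 1" on the OSD host while the test is in flight,
or to pull dump_historic_ops from the OSD's admin socket, to see whether
the disks are seek-bound during the CephFS writes.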