On Thu, Jul 21, 2016 at 10:55 AM, Fabiano de O. Lucchese
<flucchese@xxxxxxxxx> wrote:
> Hey, guys.
>
> I'm still feeling unlucky about these experiments. Here's what I did:
>
> 1) Set the parameters described below in ceph.conf
> 2) Pushed the ceph.conf to all nodes using ceph-deploy
> 3) Restarted the monitor, MDS and OSDs on all nodes
> 4) Ran the test at least twice and looked at the results from the second run
> 5) Mounted the filesystem using mount -t ceph 10.76.38.57:/ /mnt/mycephfs -o name=admin,secret=xxxxxxxxx
> 6) Ran the benchmark
>
> I modified the following parameters and ran each test separately:
>
> - "osd journal size" to 5 GB, 10 GB and 20 GB
> - "osd client message size cap" to 0, 1 and 1024

Well, that's a pretty disastrous one. 0 is unlimited, but otherwise
you're restricting the OSD to having 1 or 1024 bytes of in-flight
client message data at once. The client_oc* params I mentioned should
help even things out without serializing it so badly, although as a
client-side thing they only apply to ceph-fuse. I'm not sure what to
do about kernel mounts.
-Greg

> - "osd pool default min size" to 1 and 3
>
> In all of the above I observed a pattern similar to the one below:
> - about 5-6 Gbps throughput at the start of the test, gradually dropping to 900 Mbps by the time the test completes. I also observed that after 150-160 files have been written there is a wait of about 10-15 seconds before the next file is written.
>
> Test using FUSE:
> I also installed FUSE on cephnode1 and mounted the filesystem with the following command:
> ceph-fuse -m 10.76.38.56 /mnt/mycephfs
>
> I saw a drastic reduction in write throughput to around 170 Mbps. The system took about 5-6 seconds before it started writing any files and then stayed at a constant 150-180 Mbps write throughput while the directory was mounted using FUSE.
>
> Any additional thoughts? Would the problem be due to my NFS client?
>
> Regards,
>
> F.
>
> ________________________________
> From: Gregory Farnum <gfarnum@xxxxxxxxxx>
> To: Patrick Donnelly <pdonnell@xxxxxxxxxx>
> Cc: Fabiano de O. Lucchese <flucchese@xxxxxxxxx>; "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
> Sent: Tuesday, July 19, 2016 5:23 PM
> Subject: Re: CephFS write performance
>
> On Tue, Jul 19, 2016 at 9:39 AM, Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
>> On Tue, Jul 19, 2016 at 10:25 AM, Fabiano de O. Lucchese
>> <flucchese@xxxxxxxxx> wrote:
>>> I configured the cluster to replicate data twice (3 copies), so these numbers fall within my expectations. So far so good, but here comes the issue: I configured CephFS and mounted a share locally on one of my servers.
>>> When I write data to it, it shows abnormally high performance at the beginning for about 5 seconds, stalls for about 20 seconds and then picks up again. For long-running tests, the observed write throughput is very close to what the rados bench provided (about 640 MB/s), but for short-lived tests I get peak performances of over 5 GB/s. I know that journaling is expected to cause spiky performance patterns like that, but not to this level, which makes me think that CephFS is buffering my writes and returning control back to the client before persisting them to the journal, which looks undesirable.
>>
>> The client is buffering the writes to RADOS, which would give you the
>> abnormally high initial performance until the cache needs to be flushed.
>> You might try tweaking certain osd settings:
>>
>> http://docs.ceph.com/docs/hammer/rados/configuration/osd-config-ref/
>>
>> in particular: "osd client message size cap". Also:
>
> I am reasonably sure you don't want to change the message size cap;
> that's entirely an OSD-side throttle on how much dirty data it lets
> in before it stops reading off the wire -- and I don't think the client
> feeds back from outgoing data. More likely it's about how much dirty
> data is being absorbed by the Client before it forces writes out to
> the OSDs, and you want to look at
>
> client_oc_size (default 1024*1024*200, aka 200MB)
> client_oc_max_dirty (default 100MB)
> client_oc_target_dirty (default 8MB)
>
> and turn down the max dirty limits if you're finding it's too bumpy a ride.
> -Greg
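
As a minimal sketch (assuming the ceph-fuse client reads ceph.conf on the
node where it runs), those dirty limits could be lowered with something
like the following; the values here are purely illustrative, not
recommendations:

    [client]
    # total size of the client object cache (default ~200 MB)
    client_oc_size = 209715200
    # start flushing once this much dirty data accumulates (default 8 MB)
    client_oc_target_dirty = 8388608
    # hard cap on dirty data held in the client cache (default 100 MB);
    # lowering it trades the initial write burst for steadier throughput
    client_oc_max_dirty = 33554432

ceph-fuse would need to be remounted to pick the new values up, and as
noted above these settings do not affect the kernel client.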
>>
>> http://docs.ceph.com/docs/hammer/rados/configuration/journal-ref/
>>
>> --
>> Patrick Donnelly
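
For completeness, a rough sketch of how the OSD-side settings varied in
the tests above might look in ceph.conf; the values are illustrative only
(note that "osd journal size" is specified in MB, while "osd client
message size cap" is in bytes):

    [osd]
    # 10 GB journal; this option is expressed in MB
    osd journal size = 10240
    # throttle on in-flight client message data, in bytes (default ~500 MB);
    # values like 1 or 1024 effectively serialize client I/O, as discussed above
    osd client message size cap = 524288000

    [global]
    # minimum number of replicas that must be available for a pool to
    # accept I/O; 2 is the usual choice with 3x replication
    osd pool default min size = 2

(If I recall correctly, changing "osd journal size" in ceph.conf does not
resize an existing journal; the journal would need to be flushed and
recreated for a new size to take effect.)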