On Sat, Apr 25, 2015 at 11:36 PM, Josef Johansson <josef86@xxxxxxxxx> wrote:
> Hi,
>
> With inspiration from all the other performance threads going on here, I started to investigate on my own as well.
>
> I'm seeing a lot of iowait on the OSDs, and the journals utilised at 2-7%, with about 8-30MB/s (mostly around 8MB/s write). This is a dumpling cluster. The goal here is to increase the utilisation to maybe 50%.

I'm confused. You've got regular hard drives backing your system, so in
the long run you aren't going to be able to do much better than those
hard drives can do. The SSDs are much faster, so of course they're not
getting a load that counts as heavy for them. None of the tuning you
discuss below is going to do much except perhaps give the steady state
a longer startup time.
-Greg

> Journals: Intel DC S3700, OSDs: HGST 4TB
>
> I did some initial testing to make the wbthrottle keep more in its buffer, and I think I managed to do it, but it didn't affect the journal utilisation.
>
> There are 12 cores for the 10 OSDs on each machine, and they use about 20% of them, so I guess there's no bottleneck there.
>
> Well, that's the problem: I really can't see any bottleneck with the current layout. Maybe it's our copper 10Gb that's giving us too much latency?
>
> It would be nice to have some kind of bottleneck troubleshooting guide in the ceph docs :)
> I'm guessing I'm not the only one on these kinds of specs, and it would be interesting to see if there's optimisation to be done.
>
> Hope you guys have a nice weekend :)
>
> Cheers,
> Josef
>
> Ping from a host to an OSD:
>
> 6 packets transmitted, 6 received, 0% packet loss, time 4998ms
> rtt min/avg/max/mdev = 0.063/0.107/0.193/0.048 ms
>
> Settings on the OSDs:
>
> { "filestore_wbthrottle_xfs_ios_start_flusher": "5000"}
> { "filestore_wbthrottle_xfs_inodes_start_flusher": "5000"}
> { "filestore_wbthrottle_xfs_ios_hard_limit": "10000"}
> { "filestore_wbthrottle_xfs_inodes_hard_limit": "10000"}
> { "filestore_max_sync_interval": "30"}
>
> The defaults, for comparison:
>
> { "filestore_wbthrottle_xfs_ios_start_flusher": "500"}
> { "filestore_wbthrottle_xfs_inodes_start_flusher": "500"}
> { "filestore_wbthrottle_xfs_ios_hard_limit": "5000"}
> { "filestore_wbthrottle_xfs_inodes_hard_limit": "5000"}
> { "filestore_max_sync_interval": "5"}
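Values like these can be read back from a live OSD over its admin socket with "config get", which is presumably where the one-entry JSON lines above come from. A minimal sketch in Python, assuming osd.0 and the default socket path under /var/run/ceph:

#!/usr/bin/env python
# Minimal sketch: read the current throttle-related settings back from a
# running OSD over its admin socket.  The socket path and the choice of
# osd.0 are assumptions -- adjust for your cluster.
import json
import subprocess

ASOK = "/var/run/ceph/ceph-osd.0.asok"

KEYS = [
    "filestore_wbthrottle_xfs_ios_start_flusher",
    "filestore_wbthrottle_xfs_inodes_start_flusher",
    "filestore_wbthrottle_xfs_ios_hard_limit",
    "filestore_wbthrottle_xfs_inodes_hard_limit",
    "filestore_max_sync_interval",
]

for key in KEYS:
    # "config get <key>" prints a one-entry JSON object, e.g. { "key": "value"}
    out = subprocess.check_output(
        ["ceph", "--admin-daemon", ASOK, "config", "get", key])
    print("%s = %s" % (key, json.loads(out.decode("utf-8"))[key]))

The same options can usually be changed at runtime with something like ceph tell osd.* injectargs '--filestore_max_sync_interval 30', but they only survive a restart if they are also set in ceph.conf.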
"2015-04-26 08:24:04.028167", > "event": "sub_op_commit_rec"}, > { "time": "2015-04-26 08:24:04.028174", > "event": "commit_sent"}, > { "time": "2015-04-26 08:24:04.028182", > "event": "done"}]}, > > > _______________________________________________ > ceph-users mailing list > ceph-users@xxxxxxxxxxxxxx > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com _______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com