Been doing some more digging. I'm getting messages in the OSD logs like the ones below; I don't know whether these are normal or a clue to something not being right:

2015-05-19 18:36:27.664698 7f58b91dd700 0 log_channel(cluster) log [WRN] : slow request 30.346117 seconds old, received at 2015-05-19 18:35:57.318208: osd_repop(client.1205463.0:7612211 6.2f ec3d412f/rb.0.6e7a9.74b0dc51.0000000be050/head//6 v 2674'1102892) currently commit_sent

2015-05-19 17:50:29.700766 7ff1503db700 0 log_channel(cluster) log [WRN] : slow request 32.548750 seconds old, received at 2015-05-19 17:49:57.151935: osd_repop_reply(osd.46.0:2088048 6.64 ondisk, result = 0) currently no flag points reached

2015-05-19 17:47:26.903122 7f296b6fc700 0 log_channel(cluster) log [WRN] : slow request 30.620519 seconds old, received at 2015-05-19 17:46:56.282504: osd_op(client.1205463.0:7261972 rb.0.6e7a9.74b0dc51.0000000b7ff9 [set-alloc-hint object_size 1048576 write_size 1048576,write 258048~131072] 6.882797bc ack+ondisk+write+known_if_redirected e2674) currently commit_sent
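In case it helps with interpreting these, I've been dumping the in-flight and recent ops over the admin socket on the OSDs that log the warnings, to see which stage the slow requests are sitting at. This is just how I'm poking at it rather than anything definitive; osd.46 below is only an example (it is one of the OSDs in the messages above), so substitute whichever OSD is complaining and run the commands on the node hosting it:

  # osd.46 is just an example; run against whichever OSD logs the slow requests
  ceph daemon osd.46 dump_ops_in_flight
  ceph daemon osd.46 dump_historic_ops

The historic ops output includes the flag points mentioned in the log (commit_sent and so on) with timestamps, so hopefully it will show where the time is actually going during these hangs.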
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Nick Fisk
> Sent: 18 May 2015 17:25
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Cache Pool Flush/Eviction Limits - Hard or Soft?
>
> Just to update on this, I've been watching iostat across my Ceph nodes and I can
> see something slightly puzzling happening, which is most likely the cause of the
> slow (>32s) requests I am getting.
>
> During a client write-only IO stream, I see reads and writes to the cache tier,
> which is normal as blocks are being promoted/demoted. The latency does suffer,
> but not excessively, and is acceptable for data that has fallen out of cache.
>
> However, every now and again it appears that one of the OSDs suddenly starts
> aggressively reading and appears to block any IO until that read has finished.
> Example below, where /dev/sdd is a 10K disk in the cache tier. All other nodes
> have their /dev/sdd devices completely idle during this period. The disks on the
> base tier seem to be doing writes during this period, so it looks related to
> some sort of flushing.
>
> Device  rrqm/s  wrqm/s     r/s   w/s    rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> sdd       0.00    0.00  471.50  0.00  2680.00   0.00     11.37      0.96   2.03     2.03     0.00   1.90  89.80
>
> Most of the times I observed this whilst watching iostat, the read only lasted
> around 5-10s, but I suspect that sometimes it goes on for longer and is the cause
> of the "requests are blocked" errors. I have also noticed that this appears to
> happen more often when there are a greater number of blocks to be
> promoted/demoted. Other pools are not affected during these hangs.
>
> From the look of the iostat stats, I would assume that for a 10k disk it must be
> doing a sequential read to get that number of IOs.
>
> Does anybody have any clue what might be going on?
>
> Nick
>
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Nick Fisk
> > Sent: 30 April 2015 12:53
> > To: ceph-users@xxxxxxxxxxxxxx
> > Subject: Cache Pool Flush/Eviction Limits - Hard or Soft?
> >
> > Does anyone know if the Flush and Eviction limits are hard limits, i.e. as soon
> > as they are exceeded writes will block, or will the pool only block when it
> > reaches Target_max_bytes?
> >
> > I'm seeing really poor performance and frequent "requests are blocked" messages
> > once data starts having to be evicted/flushed, and I was just wondering if the
> > above was true.
> >
> > If the limits are soft, I would imagine making high and low target limits would
> > help:-
> >
> > Target_dirty_bytes_low=.3
> > Target_dirty_bytes_high=.4
> >
> > Once the amount of dirty bytes passes the low limit a very low priority flush
> > occurs; if the high limit is reached, data is flushed much more aggressively.
> > The same could also exist for eviction. This would allow bursts of write
> > activity to occur before flushing starts heavily impacting performance.
> >
> > Nick
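For reference, the flush/eviction thresholds being discussed above map onto the cache-pool settings below. The pool name and values here are only placeholders to illustrate, not what I'm actually running, and as far as I can see the current settings only expose a single dirty ratio rather than the separate low/high thresholds suggested above:

  # "hot-cache" and the values below are placeholders; adjust for your cache pool
  ceph osd pool set hot-cache target_max_bytes 1000000000000
  ceph osd pool set hot-cache cache_target_dirty_ratio 0.4
  ceph osd pool set hot-cache cache_target_full_ratio 0.8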