Just to update on this, I've been watching iostat across my Ceph nodes and I can see something slightly puzzling happening, which is most likely the cause of the slow (>32s) requests I am getting.

During a client write-only IO stream, I see reads and writes to the cache tier, which is normal as blocks are being promoted/demoted. The latency does suffer, but not excessively, and it is acceptable for data that has fallen out of cache.

However, every now and again one of the OSDs suddenly starts reading aggressively and appears to block all other IO until that read has finished. Example below, where /dev/sdd is a 10K disk in the cache tier. On all other nodes the /dev/sdd devices are completely idle during this period. The disks on the base tier seem to be doing writes during this period, so it looks related to some sort of flushing.

Device  rrqm/s  wrqm/s     r/s    w/s    rkB/s  wkB/s  rq-sz  qu-sz  await  r_wait  w_wait  svctm   util
sdd       0.00    0.00  471.50   0.00  2680.00   0.00  11.37   0.96   2.03    2.03    0.00   1.90  89.80

Most of the times I observed this whilst watching iostat, the read only lasted around 5-10s, but I suspect that it sometimes goes on for longer and is the cause of the "requests are blocked" errors. I have also noticed that it happens more often when there are more blocks to be promoted/demoted. Other pools are not affected during these hangs.

From the look of the iostat stats (471.5 reads/s at roughly 5.7 kB each, i.e. 2680 kB/s divided by 471.5 r/s), I would assume that a 10K disk must be doing a sequential read to reach that number of IOs.

Does anybody have any clue what might be going on?

Nick

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Nick Fisk
> Sent: 30 April 2015 12:53
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Cache Pool Flush/Eviction Limits - Hard of Soft?
>
> Does anyone know if the Flush and Eviction limits are hard limits, i.e. as soon as
> they are exceeded writes will block, or will the pool only block when it
> reaches Target_max_bytes?
>
> I'm seeing really poor performance and frequent "requests are blocked"
> messages once data starts having to be evicted/flushed, and I was just
> wondering if the above was true.
>
> If the limits are soft, I would imagine making high and low target limits would
> help:
>
> Target_dirty_bytes_low=.3
> Target_dirty_bytes_high=.4
>
> Once the amount of dirty bytes passes the low limit a very low priority flush
> occurs; if the high limit is reached data is flushed much more aggressively.
> The same could also exist for eviction. This would allow bursts of write activity to
> occur before flushing starts heavily impacting performance.
>
> Nick
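
To make the low/high idea from the quoted message a bit more concrete, here is a rough sketch of the behaviour I have in mind. It is only an illustration of the proposal, not how the Ceph tiering agent actually works: the function name, the priority labels and the 0.3/0.4 defaults (taken from the Target_dirty_bytes_low/high example above) are all made up for this sketch.

```python
def flush_priority(dirty_ratio, low=0.3, high=0.4):
    """Return a hypothetical flush priority for a cache pool.

    dirty_ratio is dirty bytes divided by the pool's target size (0.0 to 1.0).
    low and high are the proposed soft thresholds; neither is a real Ceph option.
    """
    if not 0.0 <= low < high <= 1.0:
        raise ValueError("expected 0 <= low < high <= 1")
    if dirty_ratio < low:
        return "idle"   # below the low limit: no flushing, writes run at full speed
    if dirty_ratio < high:
        return "low"    # between the limits: trickle flush at very low priority
    return "high"       # at or above the high limit: flush aggressively


# Walk through a write burst that gradually dirties the cache.
for ratio in (0.10, 0.35, 0.45):
    print(f"dirty={ratio:.2f} -> flush priority {flush_priority(ratio)}")
```

The point of the two thresholds is that a burst of writes which takes the pool from 0.3 to just under 0.4 would only trigger background flushing, so client IO should not be hit hard until the high limit is actually reached. The same shape could apply to eviction.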