The max target limit is a hard limit: the OSDs won't let more than that amount of data into the cache tier. They will start flushing and evicting based on the percentage ratios you can set (cache_target_dirty_ratio and cache_target_full_ratio; a rough sketch of the arithmetic follows below the quoted thread), and you may need to set these more aggressively for your workload.

The tricky bit is that the OSDs don't have global knowledge of how much total data is in the cache, so when you set a 100TB cache that has 1024 PGs, the OSDs actually apply those limits on a per-PG basis and won't let any given PG use more than 100/1024 TB (roughly 100GB per PG). That is probably the cause of the heavy read activity you're seeing on one OSD at a time, when it happens to reach the hard limit. :/

The specific blocked ops you're seeing are in various stages and are probably just indicative of the OSD doing a bunch of flushing, which blocks other accesses.
-Greg

On Tue, May 19, 2015 at 12:03 PM, Nick Fisk <nick at fisk.me.uk> wrote:
> Been doing some more digging. I'm getting messages like the ones below in the OSD logs; I don't know if these are normal or a clue that something isn't right.
>
> 2015-05-19 18:36:27.664698 7f58b91dd700 0 log_channel(cluster) log [WRN] : slow request 30.346117 seconds old, received at 2015-05-19 18:35:57.318208: osd_repop(client.1205463.0:7612211 6.2f ec3d412f/rb.0.6e7a9.74b0dc51.0000000be050/head//6 v 2674'1102892) currently commit_sent
>
> 2015-05-19 17:50:29.700766 7ff1503db700 0 log_channel(cluster) log [WRN] : slow request 32.548750 seconds old, received at 2015-05-19 17:49:57.151935: osd_repop_reply(osd.46.0:2088048 6.64 ondisk, result = 0) currently no flag points reached
>
> 2015-05-19 17:47:26.903122 7f296b6fc700 0 log_channel(cluster) log [WRN] : slow request 30.620519 seconds old, received at 2015-05-19 17:46:56.282504: osd_op(client.1205463.0:7261972 rb.0.6e7a9.74b0dc51.0000000b7ff9 [set-alloc-hint object_size 1048576 write_size 1048576,write 258048~131072] 6.882797bc ack+ondisk+write+known_if_redirected e2674) currently commit_sent
>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces at lists.ceph.com] On Behalf Of Nick Fisk
>> Sent: 18 May 2015 17:25
>> To: ceph-users at lists.ceph.com
>> Subject: Re: Cache Pool Flush/Eviction Limits - Hard or Soft?
>>
>> Just to update on this: I've been watching iostat across my Ceph nodes and I can see something slightly puzzling happening, which is most likely the cause of the slow (>32s) requests I am getting.
>>
>> During a client write-only IO stream, I see reads and writes to the cache tier, which is normal as blocks are being promoted/demoted. The latency does suffer, but not excessively, and it is acceptable for data that has fallen out of cache.
>>
>> However, every now and again one of the OSDs suddenly starts aggressively reading and appears to block any IO until that read has finished. An example is below, where /dev/sdd is a 10K disk in the cache tier. All other nodes have their /dev/sdd devices completely idle during this period. The disks on the base tier seem to be doing writes during this period, so it looks related to some sort of flushing.
>>
>> Device    rrqm/s  wrqm/s     r/s    w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
>> sdd         0.00    0.00  471.50   0.00  2680.00    0.00    11.37     0.96   2.03    2.03    0.00   1.90  89.80
>>
>> Most of the times I observed this while watching iostat, the read only lasted around 5-10s, but I suspect that sometimes it goes on for longer and is the cause of the "requests are blocked" errors. I have also noticed that this appears to happen more often when there is a greater number of blocks to be promoted/demoted. Other pools are not affected during these hangs.
>>
>> From the iostat figures, I would assume that for a 10K disk it must be doing sequential reads to achieve that number of IOs.
>>
>> Does anybody have any clue what might be going on?
>>
>> Nick
>>
>> > -----Original Message-----
>> > From: ceph-users [mailto:ceph-users-bounces at lists.ceph.com] On Behalf Of Nick Fisk
>> > Sent: 30 April 2015 12:53
>> > To: ceph-users at lists.ceph.com
>> > Subject: Cache Pool Flush/Eviction Limits - Hard or Soft?
>> >
>> > Does anyone know if the flush and eviction limits are hard limits, i.e. as soon as they are exceeded writes will block, or will the pool only block when it reaches target_max_bytes?
>> >
>> > I'm seeing really poor performance and frequent "requests are blocked" messages once data starts having to be evicted/flushed, and I was wondering if the above is true.
>> >
>> > If the limits are soft, I would imagine that having high and low target limits would help:
>> >
>> > Target_dirty_bytes_low = 0.3
>> > Target_dirty_bytes_high = 0.4
>> >
>> > Once the amount of dirty bytes passes the low limit, a very low priority flush occurs; if the high limit is reached, data is flushed much more aggressively. The same could also exist for eviction. This would allow bursts of write activity to occur before flushing starts heavily impacting performance.
>> >
>> > Nick
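
To put the per-PG arithmetic above into concrete numbers, here is a minimal sketch (plain Python, not Ceph code) using the 100TB / 1024 PG figures from this thread. The ratio values used are the documented defaults for cache_target_dirty_ratio and cache_target_full_ratio, but treat them as assumptions; check your own pool with "ceph osd pool get <cachepool> cache_target_dirty_ratio" etc.

#!/usr/bin/env python
# Illustrative only (not Ceph source): shows how per-pool cache-tier targets
# translate into per-PG limits when each OSD only has per-PG knowledge.
# target_max_bytes, cache_target_dirty_ratio and cache_target_full_ratio are
# real pool settings; the arithmetic below is a simplification of the
# behaviour described above.

TB = 10 ** 12                      # the thread talks about a "100TB" cache

target_max_bytes = 100 * TB        # pool-level hard limit from the thread
pg_num = 1024                      # PGs in the cache pool, from the thread

cache_target_dirty_ratio = 0.4     # assumed default: start flushing dirty data at 40%
cache_target_full_ratio = 0.8      # assumed default: start evicting clean data at 80%

per_pg_hard_limit = target_max_bytes / float(pg_num)
per_pg_flush_threshold = per_pg_hard_limit * cache_target_dirty_ratio
per_pg_evict_threshold = per_pg_hard_limit * cache_target_full_ratio

gib = 1024.0 ** 3
print("per-PG hard limit  : %6.1f GiB" % (per_pg_hard_limit / gib))
print("per-PG flush starts: %6.1f GiB" % (per_pg_flush_threshold / gib))
print("per-PG evict starts: %6.1f GiB" % (per_pg_evict_threshold / gib))

The ratios themselves are adjusted per pool, e.g. "ceph osd pool set <cachepool> cache_target_dirty_ratio <ratio>", which is the kind of tuning suggested above for workloads that keep running one PG into the hard limit.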