Re: Cache Pool Flush/Eviction Limits - Hard or Soft?


 



Been doing some more digging. I'm getting messages like these in the OSD
logs; I don't know if they are normal or a clue that something is not right:

2015-05-19 18:36:27.664698 7f58b91dd700  0 log_channel(cluster) log [WRN] :
slow request 30.346117 seconds old, received at 2015-05-19 18:35:57.318208:
osd_repop(client.1205463.0:7612211 6.2f
ec3d412f/rb.0.6e7a9.74b0dc51.0000000be050/head//6 v 2674'1102892) currently
commit_sent

2015-05-19 17:50:29.700766 7ff1503db700  0 log_channel(cluster) log [WRN] :
slow request 32.548750 seconds old, received at 2015-05-19 17:49:57.151935:
osd_repop_reply(osd.46.0:2088048 6.64 ondisk, result = 0) currently no flag
points reached

2015-05-19 17:47:26.903122 7f296b6fc700  0 log_channel(cluster) log [WRN] :
slow request 30.620519 seconds old, received at 2015-05-19 17:46:56.282504:
osd_op(client.1205463.0:7261972 rb.0.6e7a9.74b0dc51.0000000b7ff9
[set-alloc-hint object_size 1048576 write_size 1048576,write 258048~131072]
6.882797bc ack+ondisk+write+known_if_redirected e2674) currently commit_sent
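
To see whether these warnings cluster around one op type or state, something like the following could tally them. This is only a minimal sketch: it assumes each warning sits on a single line in the real OSD log (the mail client wrapped them above), and the regex and the function name `summarize_slow_requests` are my own, not anything Ceph provides.

```python
import re
from collections import Counter

# Assumed shape of a slow-request warning once it is back on one line.
SLOW_RE = re.compile(
    r"slow request (?P<age>\d+\.\d+) seconds old, "
    r"received at (?P<received>\S+ \S+): "
    r"(?P<op>\w+)\(.*\) currently (?P<state>.+)$"
)

def summarize_slow_requests(lines):
    """Count slow requests per (op type, state) and track the oldest one."""
    counts = Counter()
    worst_age = 0.0
    for line in lines:
        match = SLOW_RE.search(line.strip())
        if match is None:
            continue  # not a slow-request warning
        counts[(match["op"], match["state"])] += 1
        worst_age = max(worst_age, float(match["age"]))
    return counts, worst_age
```

Feeding it the lines of an OSD log file returns a `Counter` keyed by pairs such as `("osd_op", "commit_sent")`, plus the largest age seen in seconds.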





> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Nick Fisk
> Sent: 18 May 2015 17:25
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Cache Pool Flush/Eviction Limits - Hard or Soft?
> 
> Just to update on this, I've been watching iostat across my Ceph nodes and
> I can see something slightly puzzling happening, which is most likely the
> cause of the slow (>32s) requests I am getting.
> 
> During a client write-only IO stream, I see reads and writes to the cache
> tier, which is normal as blocks are being promoted/demoted. The latency
> does suffer, but not excessively, and is acceptable for data that has
> fallen out of cache.
> 
> However, every now and again it appears that one of the OSDs suddenly
> starts reading aggressively and appears to block any IO until that read
> has finished. Example below, where /dev/sdd is a 10K disk in the cache
> tier. On all other nodes the /dev/sdd devices are completely idle during
> this period. The disks on the base tier seem to be doing writes during
> this period, so it looks related to some sort of flushing.
> 
> Device  rrqm/s  wrqm/s     r/s   w/s    rkB/s  wkB/s  rq-sz  qu-sz  await  r_wait  w_wait  svctm   util
> sdd       0.00    0.00  471.50  0.00  2680.00   0.00  11.37   0.96   2.03    2.03    0.00   1.90  89.80
> 
> Most of the times I observed this whilst watching iostat, the read only
> lasted around 5-10s, but I suspect that sometimes it goes on for longer
> and is the cause of the "requests are blocked" errors. I have also noticed
> that this appears to happen more often when there is a greater number of
> blocks to be promoted/demoted. Other pools are not affected during these
> hangs.
> 
> From the look of the iostat stats, I would assume that for a 10k disk, it
> must be doing a sequential read to get that number of IOs.
> 
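
As a rough sanity check on the iostat line above, the per-request figures can be derived from the raw rates. This is a sketch under assumptions: I am taking `rq-sz` to be iostat's average request size in 512-byte sectors and `await`/`svctm` to be in milliseconds; the function names are made up for illustration.

```python
def avg_read_kb(rkb_per_s, reads_per_s):
    """Average size of one read in kB, from throughput and read rate."""
    return rkb_per_s / reads_per_s

def sectors_to_kb(sectors, sector_bytes=512):
    """Convert a request size in sectors to kB (assumes 512-byte sectors)."""
    return sectors * sector_bytes / 1024

def queue_depth(reads_per_s, await_ms):
    """Little's law: requests in flight = arrival rate x time in system."""
    return reads_per_s * await_ms / 1000.0

def utilisation_pct(reads_per_s, svctm_ms):
    """Fraction of each second the device spends servicing requests."""
    return reads_per_s * svctm_ms / 1000.0 * 100.0

# Values taken from the sdd line above.
size_kb = avg_read_kb(2680.00, 471.50)
depth = queue_depth(471.50, 2.03)
busy = utilisation_pct(471.50, 1.90)
```

With the numbers above each read works out at under 6 kB, which agrees with the reported `rq-sz`, and the derived queue depth and utilisation land close to the reported `qu-sz` and `util`. So the disk is serving a large number of small reads rather than a few large ones.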
> Does anybody have any clue what might be going on?
> 
> Nick
> 
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
> > Of Nick Fisk
> > Sent: 30 April 2015 12:53
> > To: ceph-users@xxxxxxxxxxxxxx
> > Subject: Cache Pool Flush/Eviction Limits - Hard or Soft?
> >
> > Does anyone know if the Flush and Eviction limits are hard limits, i.e.
> > as soon as they are exceeded writes will block, or will the pool only
> > block when it reaches Target_max_bytes?
> >
> > I'm seeing really poor performance and frequent "requests are blocked"
> > messages once data starts having to be evicted/flushed, and I was just
> > wondering if the above was true.
> >
> > If the limits are soft, I would imagine making high and low target
> > limits would help:
> >
> > Target_dirty_bytes_low=.3
> > Target_dirty_bytes_high=.4
> >
> > Once the amount of dirty bytes passes the low limit, a very low priority
> > flush occurs; if the high limit is reached, data is flushed much more
> > aggressively. The same could also exist for eviction. This would allow
> > bursts of write activity to occur before flushing starts heavily
> > impacting performance.
> >
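
The two-threshold idea described above could be sketched roughly as follows. This is purely illustrative: `Target_dirty_bytes_low` and `Target_dirty_bytes_high` are the proposed settings from this mail, not existing options, and I am assuming the .3 and .4 values are fractions of `Target_max_bytes`. The function name and the returned labels are hypothetical.

```python
def flush_mode(dirty_bytes, target_max_bytes, low=0.3, high=0.4):
    """Pick a flush behaviour from the dirty fraction of the cache pool.

    low/high mirror the proposed Target_dirty_bytes_low/high values and are
    assumed to be fractions of target_max_bytes.
    """
    if target_max_bytes <= 0:
        raise ValueError("target_max_bytes must be positive")
    if not 0 <= low < high <= 1:
        raise ValueError("need 0 <= low < high <= 1")

    dirty_ratio = dirty_bytes / target_max_bytes
    if dirty_ratio >= high:
        return "aggressive"   # high limit reached: flush hard
    if dirty_ratio >= low:
        return "background"   # low limit passed: very low priority flush
    return "idle"             # below both limits: absorb the write burst
```

For a 100 GB cache, 35 GB of dirty data would give `"background"` and 45 GB would give `"aggressive"`; the gap between the two limits is the headroom available for a burst of writes.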
> > Nick
> >
> 
> 
> 
> 
> 
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







