On Tue, 11 Nov 2014 10:21:49 -0800 Gregory Farnum wrote:

> On Mon, Nov 10, 2014 at 10:58 PM, Christian Balzer <chibi@xxxxxxx> wrote:
> >
> > Hello,
> >
> > One of my clusters has become busy enough (I'm looking at you, evil
> > Windows VMs that I shall banish elsewhere soon) to experience
> > client-noticeable performance impacts during deep scrub.
> > Before this I instructed all OSDs to deep scrub in parallel on
> > Saturday night, and that finished before Sunday morning.
> > So for now I'll fire them off one by one to reduce the load.
> >
> > Looking forward, that cluster doesn't need more space, so instead of
> > adding more hosts and OSDs I was thinking of a cache pool.
> >
> > I suppose that will keep the clients happy while the slow pool gets
> > scrubbed.
> > Has anybody tested cache pools with Firefly and compared the
> > performance to Giant?
> >
> > For testing I'm currently playing with a single storage node and 8
> > SSD-backed OSDs.
> > Now what very much blew my mind is that a pool with a replication of
> > 1 still does quite the impressive read orgy, clearly reading all the
> > data in the PGs.
> > Why? And what is it comparing that data with, the cosmic background
> > radiation?
>
> Yeah, cache pools currently do full-object promotions whenever an
> object is accessed. There are some ideas and projects to improve this
> or reduce its effects, but they're mostly just getting started.

Thanks for confirming that, so probably not much better than Firefly
_aside_ from the fact that SSD pools should be quite a bit faster in
and of themselves in Giant.
Guess there is no other way to find out than to test things; I have a
feeling that determining the "hot" working set otherwise will be
rather difficult.
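For what it's worth, the test tier on the SSD node is set up roughly
like this (a sketch from memory; the backing pool name "rbd", the PG
counts and the byte limit are just illustrative values, not a
recommendation):

  # create the cache pool and attach it as a writeback tier
  ceph osd pool create cache 512 512
  ceph osd tier add rbd cache
  ceph osd tier cache-mode cache writeback
  ceph osd tier set-overlay rbd cache

  # hit set tracking, which is what the tier uses to judge "hotness"
  ceph osd pool set cache hit_set_type bloom
  ceph osd pool set cache hit_set_count 1
  ceph osd pool set cache hit_set_period 3600

  # cap the cache so flushing/eviction kicks in before the SSDs fill up
  ceph osd pool set cache target_max_bytes 200000000000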
> At least, I assume that's what you mean by a read orgy; perhaps you
> are seeing something else entirely?
>
Indeed I did; this was just an observation that any pool with a
replica of 1 will still read ALL the data during a deep scrub.
What good would that do?
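In case anybody wants to reproduce this, kicking a single PG and
watching the disks shows it nicely; something along these lines (the
PG ID is of course just an example, pick one from "ceph pg dump"):

  # ask the primary OSD to deep scrub one PG of the replica-1 pool
  ceph pg deep-scrub 5.3a
  # meanwhile, watch the OSD disks: all data in the PG gets read back
  iostat -x 2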
> Also, even on cache pools you don't really want to run with 1x
> replication as they hold the only copy of whatever data is dirty...
>
Oh, I agree, this is for testing only.
Also, a replica of 1 doesn't have to mean that the data is unsafe (the
OSDs could be RAIDed).
But even so, in production the loss of a single node shouldn't impact
things, and once you go there, a replica of 2 comes naturally.
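As for firing off the deep scrubs one by one, that is nothing fancier
than a loop like the following (a sketch: the fixed sleep is a crude
stand-in for actually waiting until each OSD has finished, so size it
generously):

  #!/bin/sh
  # Deep scrub all OSDs sequentially instead of in parallel.
  for i in $(ceph osd ls); do
      ceph osd deep-scrub "$i"
      # crude pacing; 1800s is an arbitrary example value
      sleep 1800
  done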
Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/