Re: RBD: How many snapshots is too many?

Hi Greg,

thanks for your insight! I do have a few follow-up questions.

On 09/05/2017 11:39 PM, Gregory Farnum wrote:
>> It seems to me that there still isn't a good recommendation along the
>> lines of "try not to have more than X snapshots per RBD image" or "try
>> not to have more than Y snapshots in the cluster overall". Or is the
>> "correct" recommendation actually "create as many snapshots as you
>> might possibly want, none of that is allowed to create any instability
>> nor performance degradation and if it does, that's a bug"?
> 
> I think we're closer to "as many snapshots as you want", but there are
> some known shortcomings there.
> 
> First of all, if you haven't seen my talk from the last OpenStack
> summit on snapshots and you want a bunch of details, go watch that. :p
> https://www.openstack.org/videos/boston-2017/ceph-snapshots-for-fun-and-profit-1

OK, so I just rewatched that to see if I had missed anything regarding
recommendations for how many snapshots are sane. For anyone else
following this thread, there are two items I could make out, and I'm
taking the liberty of including the direct links here:

- From the talk itself: https://youtu.be/rY0OWtllkn8?t=26m29s

This says don't do a snapshot every minute on each RBD, but one per day
is probably OK. That is still *very* vague, unfortunately, since as you
point out in the talk the overhead associated with snapshots is strongly
related to how many RADOS-level snapshots there are in the cluster
overall, and clearly it makes a big difference whether you're taking one
daily snapshot of 10 RBD images, or of 100,000.

So, can you refine that estimate a bit? As in, can you give at least an
order-of-magnitude estimate for "this many snapshots overall is probably
OK, but multiply by 10 and you're in trouble"?
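To make explicit why a per-image recommendation doesn't pin down the quantity that actually matters, here is a quick back-of-the-envelope sketch (my own numbers and a hypothetical fixed-retention policy, not anything from the talk):

```python
# The cluster-wide RADOS-level snapshot count is roughly
# (number of images) x (snapshots per image per day) x (retention days).
def total_snapshots(num_images, snaps_per_day, retention_days):
    """Cluster-wide snapshot count under a simple fixed-retention
    policy (hypothetical policy, purely for illustration)."""
    return num_images * snaps_per_day * retention_days

# One daily snapshot of 10 images, kept for 30 days: 300 snapshots.
small = total_snapshots(10, 1, 30)

# The exact same per-image schedule over 100,000 images: 3,000,000
# snapshots -- four orders of magnitude more, yet both clusters
# satisfy "one snapshot per RBD image per day".
large = total_snapshots(100_000, 1, 30)

print(small, large)
```

The same per-image policy lands at wildly different cluster-wide totals, which is exactly why an order-of-magnitude figure for the overall count would be so useful.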

- From the Q&A: https://youtu.be/rY0OWtllkn8?t=36m58s

Here, you talk about how having many holes in the interval set governing
the snap trim queue can be a problem. That one is rather tricky too,
because as far as I can tell there is really no way for users to
influence this (other than, of course, deleting *all* snapshots or never
creating or deleting any at all).

> There are a few dimensions in which there can be failures with snapshots:
> 1) right now the way we mark snapshots as deleted is suboptimal — when
> deleted they go into an interval_set in the OSDMap. So if you have a
> bunch of holes in your deleted snapshots, it is possible to inflate
> the osdmap to a size which causes trouble. But I'm not sure if we've
> actually seen this be an issue yet — it requires both a large cluster,
> and a large map, and probably some other failure causing osdmaps to be
> generated very rapidly.

Can you give an estimate as to what a "large" map is in this context? In
other words, when is a map sufficiently inflated with that interval set
to be a problem?
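To make the "holes" point concrete for other readers, here is a toy model (my own sketch, not Ceph's actual interval_set implementation) of how the size of the deleted-snaps set depends on the deletion *pattern*, not the number of deletions:

```python
def to_intervals(deleted_ids):
    """Collapse a collection of deleted snap IDs into [start, end]
    runs, loosely mimicking what an interval_set stores per pool."""
    intervals = []
    for i in sorted(deleted_ids):
        if intervals and intervals[-1][1] == i - 1:
            intervals[-1][1] = i          # extend the current run
        else:
            intervals.append([i, i])      # a hole starts a new run
    return intervals

# Deleting a contiguous range of 1000 snapshots: a single interval,
# so the per-pool entry in the OSDMap stays tiny.
contiguous = to_intervals(range(1, 1001))

# Deleting every other snapshot: the same 500 deletions produce 500
# separate intervals -- this is the fragmentation that can inflate
# the OSDMap when snapshots are created and deleted non-contiguously.
sparse = to_intervals(range(1, 1001, 2))

print(len(contiguous), len(sparse))
```

Which, if my reading is right, is also why users have so little leverage here: the hole pattern falls out of the snapshot create/delete history rather than any tunable.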

> 2) There may be issues with how rbd records what snapshots it is
> associated with? No idea about this; haven't heard of any.
> 
> 3) Trimming snapshots requires IO. This is where most (all?) of the
> issues I've seen have come from; either in it being unscheduled IO
> that the rest of the system doesn't account for or throttle (as in the
> links you highlighted) or in admins overwhelming the IO capacity of
> their clusters.

Again, I think (correct me if I'm wrong here) that trimming does factor
into your "one snapshot per RBD image per day" recommendation, but would
you be able to express that in terms of overall RADOS-level snapshots?
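As an aside, for anyone else hitting the trim-IO problem in practice: the mitigation I'm aware of is throttling trim work at the OSD level. The option names below are as I remember them for this era of Ceph; please verify them against the documentation for your release before applying anything:

```shell
# Sleep between snap trim operations (seconds; 0 disables the sleep),
# applied at runtime to all OSDs:
ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1'

# Lower the priority of snap trim work relative to client IO:
ceph tell osd.* injectargs '--osd_snap_trim_priority 1'

# To make either setting persistent, set it in ceph.conf under [osd]:
#   osd snap trim sleep = 0.1
#   osd snap trim priority = 1
```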

Thanks again!

Cheers,
Florian


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
