Re: Snapshots of consistency groups

Victor Denisov <vdenisov@xxxxxxxxxxxx> · Wed, 4 Jan 2017 10:29:37 -0800

It looks like next CDM is only next month. Let's try to figure it out in email.

> Since you know which images are linked to the group and you know which
> snapshots are in the group and which group snapshots are in the image,
> you can reconcile any issues using the details in the
> GroupSnapshotNamespace -- there shouldn't be any need to depend on the
> actual snapshot name (it could technically just be a randomly assigned
> UUID).

Let's say I have a consistency group and a snapshot of this group: CG
and CGSNAP.
Images in this snapshot I'll define as:
CG.images[0] - image1
CG.images[1] - image2
CG.images[2] - image3

Image snapshots in cg snapshot will be:
CG.CGSNAP.snaps[0] - reference to snapshot of image 1
CG.CGSNAP.snaps[1] - reference to snapshot of image 2

Imagine that this snapshot was created, but wasn't finalized.
CG.CGSNAP.state == PENDING.
CG.CGSNAP.snaps.length == 0;
I'll be writing in pseudo code.

Now, let's say we want to remove this pending CGSNAP. This is the code
how it's currently implemented:

for (image: CG.images) {
  snap_name = image.id + "_" + CG.CGSNAP.id + "_" + CG.id // This name
is unique because of uniqueness of the tuple (image.id, CG.CGSNAP.id,
CG.id)
  remove_image_snapshot(snap_name);
}
remove_cg_snap(CGSNAP);

However, if we don't rely on the name then this is how I envision the code:

for (image: CG.images) {
  for (snap: image.snaps) {
    if (snap.namespace.cg_id == CG.id && snap.namespace.cg_snap_id ==
CG.CGSNAP.id) { // this is our snapshot
      remove_image_snapshot(snap.name);
    }
  }
}
remove_cg_snap(CGSNAP);

In this solution I don't like the internal loop.
What do you think?

Thanks,
Victor.

On Tue, Jan 3, 2017 at 6:56 PM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
> After starting the process of creating a group snapshot, you will
> already have all the necessary data for the group snapshot namespace
> [1] (group pool, group id, and group snapshot id) and the group
> snapshot should be persistently recorded to disk as
> GROUP_SNAPSHOT_STATE_PENDING.
>
> Looking at the snapshot create state machine [2], I don't see any
> place that a crash (or similar failure) would matter before the actual
> image snapshot record is created atomically. You would pass the fully
> populated GroupSnapshotNamespace to snap_create, and if the snapshot
> is created, it's linked to the group via that namespace and any
> failures afterwards don't matter since they are linked -- if the
> snapshot fails to be created, it isn't linked to the group but the
> snapshot doesn't exist either so there isn't anything to clean up.
>
> Since you know which images are linked to the group and you know which
> snapshots are in the group and which group snapshots are in the image,
> you can reconcile any issues using the details in the
> GroupSnapshotNamespace -- there shouldn't be any need to depend on the
> actual snapshot name (it could technically just be a randomly assigned
> UUID).
>
> Perhaps we could talk about this at a future RBD standup meeting that
> you are able to join (or the next CDM).
>
> [1] https://github.com/ceph/ceph/blob/master/src/cls/rbd/cls_rbd_types.h#L249
> [2] https://github.com/ceph/ceph/blob/master/src/librbd/operation/SnapshotCreateRequest.h#L28
>
> On Tue, Jan 3, 2017 at 7:40 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>> Let's say we start creating a group snapshot.
>> We invoke async snap_create method in Operations class.
>> When we invoke this method we provide it with the snapshot name.
>>
>> While we are wating for the response we can be aborted.
>> As a result we will be able to find the exact image snapshot using only its name
>> as this was the only information we had at the time of running
>> snap_create method.
>>
>> If snap_create was successful we will be able to find the snapshot
>> otherwise we will not.
>> However if we allow renaming snapshots from GroupSnapshotNamespace
>> then we may not find the snapshot even if it
>> was created successfully.
>>
>> On Fri, Dec 16, 2016 at 6:53 AM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>> Can you give a little background on this specific inconsistent case
>>> you are referring to?
>>>
>>> On Thu, Dec 15, 2016 at 7:05 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>>>> Yes, but if image's snapshot is renamed then I'm not able to find this
>>>> snapshot having only group's snapshot in an inconsistent state for
>>>> example.
>>>>
>>>> V.
>>>>
>>>> On Thu, Dec 15, 2016 at 7:10 AM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>>>> I think I might be confused. When creating a group snapshot, we have
>>>>> the ConsistencyGroupSnapshot that allows you to store the necessary
>>>>> linkage between the image's snapshot and its associated group snapshot
>>>>> [1]. Why not just name the image's snapshots to the same name as the
>>>>> parent group snapshot name and search the snapshot's metadata to get
>>>>> the linkage?
>>>>>
>>>>> [1] https://github.com/ceph/ceph/blob/master/src/cls/rbd/cls_rbd_types.h#L255
>>>>>
>>>>> On Tue, Dec 13, 2016 at 8:03 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>>>>>> Jason,
>>>>>>
>>>>>> My current implementation of consistency group snapshot feature names
>>>>>> image snapshots like: <group_pool>_<group_id>_<group_snap_id>
>>>>>> I rely on this fact when I need to remove a consistency group. It's
>>>>>> necessary because if some of image snapshots were created, but the
>>>>>> whole group snapshot operation failed,
>>>>>> then the only way to find those dangling image snapshots is by this name.
>>>>>>
>>>>>> It means that we should forbid renaming snapshots from
>>>>>> ConsistencyGroupSnapshot namespace.
>>>>>>
>>>>>> Another option is to allocate image snapshot ids during the creation
>>>>>> of group snapshot, but this requires a major rewrite of the whole
>>>>>> process of snapshot creation for images.
>>>>>>
>>>>>> What is your opinion on this?
>>>>>>
>>>>>> Thanks,
>>>>>> V.
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 1, 2016 at 8:01 AM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>>>>>> I chatted with Xing on IRC this morning re: Cinder generic groups. It
>>>>>>> sounds like RBD will need to support preserving the image's
>>>>>>> consistency group snapshots even if the image is removed from the
>>>>>>> group. In the OpenStack case, you won't have to worry about the image
>>>>>>> being deleted while it still has associated group snapshots.
>>>>>>>
>>>>>>> We will also want to support being able to clone child images from a
>>>>>>> group snapshot to ensure that we can thin provision new groups volumes
>>>>>>> when creating a new group from a group snapshot. This means that the
>>>>>>> group snapshots should be able to be protected/unprotected just like
>>>>>>> standard user snapshots.
>>>>>>>
>>>>>>> On Mon, Oct 31, 2016 at 9:07 AM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>>>>>>> Looking at the Cinder codebase, I don't see any such restriction that
>>>>>>>> would prevent you from removing a volume from a consistency group that
>>>>>>>> has associated snapshots. I would double-check on the OpenStack
>>>>>>>> development mailing list if this is correct and is the intent. Worst
>>>>>>>> case, the RBD driver could raise an exception if there are still
>>>>>>>> consistency group snapshots associated to the image.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Oct 28, 2016 at 6:41 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>>>>>>>>> Another thing that bothers me. When we remove an image from a consistency group.
>>>>>>>>> Should we remove all snapshots of this image that were created as part
>>>>>>>>> of a consistency group snapshot?
>>>>>>>>>
>>>>>>>>> The easiest solution would be to remove all snapshots that are in
>>>>>>>>> GroupSnapshotNamespace and reference this consistency group.
>>>>>>>>> I looked into cinder docs for this feature:
>>>>>>>>> http://docs.openstack.org/admin-guide/blockstorage-consistency-groups.html
>>>>>>>>>
>>>>>>>>> But it's not clear to me which behavior cinder expects.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> V.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 26, 2016 at 6:16 AM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>>>>>>>>> In a perfect world, it would be nice to add a new optional to "rbd
>>>>>>>>>> snap ls" to show all snapshots (with a new column to indicate the
>>>>>>>>>> associated namespace).
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 25, 2016 at 11:07 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>>>>>>>>>>> Question. When we print out snapshots of an image, should the group
>>>>>>>>>>> snapshots be listed, or should they be marked as special snapshots?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> V.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 10, 2016 at 3:14 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>> Ok. I didn't have any intention to throw exceptions.
>>>>>>>>>>>> I was more concerned about whether it's ok to allocate and delete
>>>>>>>>>>>> objects or I should use smart pointers.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 10, 2016 at 7:18 AM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>>>>>>>>>>>> The only place exceptions are routinely used is within the "::decode"
>>>>>>>>>>>>> functions. I would prefer to see the code not throwing new exceptions
>>>>>>>>>>>>> on purpose.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Oct 7, 2016 at 2:26 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>> Are any exceptions used in librbd code? Should the code be exception safe?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> V.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Sep 16, 2016 at 10:37 AM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>>>>>>>>>>>>>> On Thu, Sep 15, 2016 at 7:17 PM, Victor Denisov <vdenisov@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>>>> if (struct_v >= 5) {
>>>>>>>>>>>>>>>>       ::decode(snapshot_namespace, p);
>>>>>>>>>>>>>>>>     } else {
>>>>>>>>>>>>>>>>       snapshot_namespace = cls::rbd::UserSnapshotNamespace();
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> then code for ::encode function of cls_rbd_snap would change accordingly:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> instead of
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> boost::apply_visitor(cls::rbd::EncodeSnapshotTypeVisitor(bl),
>>>>>>>>>>>>>>>> snapshot_namespace);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would do:
>>>>>>>>>>>>>>>> ::encode(snapshot_namespace, bl);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1 -- looks good to me
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Jason
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Jason
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Jason
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jason
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jason
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jason
>>>
>>>
>>>
>>> --
>>> Jason
>
>
>
> --
> Jason
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html