I have been doing a lot of testing. The size of the RBD image doesn't have any effect. I run into the issue once I actually write data to the rbd. The more data I write out, the larger the chance of reproducing the issue. I seem to hit the issue of missing the filesystem altogether the most, but I have also had a few instances where some of the data was simply missing.

I monitor the mirror status on the remote cluster until the snapshot is 100% copied and also make sure all the IO is done. My setup has no issue maxing out my 10G interconnect during replication, so it's pretty obvious once it's done.

The only way I have found to resolve the issue is to call a mirror resync on the secondary array. I can then map the rbd on the primary, write more data to it, snap it again, and I am back in the same position.

From: "adamb" <adamb@xxxxxxxxxx>
To: "dillaman" <dillaman@xxxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
Sent: Thursday, January 21, 2021 3:11:31 PM
Subject: Re: RBD-Mirror Snapshot Backup Image Uses

Sure thing.

root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-1
SNAPID  NAME                                                                                           SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
 12192  TestSnapper1                                                                                   2 TiB             Thu Jan 21 14:15:02 2021  user
 12595  .mirror.non_primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.34c4a53e-9525-446c-8de6-409ea93c5edd  2 TiB             Thu Jan 21 15:05:02 2021  mirror (non-primary peer_uuids:[] 6c26557e-d011-47b1-8c99-34cf6e0c7f2f:12801 copied)

root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
vm-100-disk-1:
  global_id:   a04e92df-3d64-4dc4-8ac8-eaba17b45403
  state:       up+replaying
  description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611259501,"remote_snapshot_timestamp":1611259501,"replay_state":"idle"}
  service:     admin on Bunkcephmon1
  last_update: 2021-01-21 15:06:24
  peer_sites:
    name: ccs
    state: up+stopped
    description: local image is primary
    last_update: 2021-01-21 15:06:23

root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 CephTestPool1/vm-100-disk-1-CLONE
root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
/dev/nbd0
root@Bunkcephtest1:~# blkid /dev/nbd0
root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.

Primary still looks good.

root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 CephTestPool1/vm-100-disk-1-CLONE
root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
/dev/nbd0
root@Ccscephtest1:~# blkid /dev/nbd0
/dev/nbd0: UUID="830b8e05-d5c1-481d-896d-14e21d17017d" TYPE="ext4"
root@Ccscephtest1:~# mount /dev/nbd0 /usr2
root@Ccscephtest1:~# cat /proc/mounts | grep nbd0
/dev/nbd0 /usr2 ext4 rw,relatime 0 0
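The recovery path described at the top of this message, forcing a mirror resync, is driven entirely from the secondary (non-primary) cluster. A minimal sketch of that sequence, assuming the image names used above:

# Run against the secondary cluster: flag the local copy for a rebuild
# from the primary image.
rbd mirror image resync CephTestPool1/vm-100-disk-1

# rbd-mirror then re-creates the local image from the primary; wait until
# the image is back to up+replaying with "replay_state":"idle" before
# cloning anything from it.
watch rbd mirror image status CephTestPool1/vm-100-disk-1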
From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
To: "adamb" <adamb@xxxxxxxxxx>
Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
Sent: Thursday, January 21, 2021 3:01:46 PM
Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses

On Thu, Jan 21, 2021 at 11:51 AM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
>
> I was able to trigger the issue again.
>
> - On the primary I created a snap called TestSnapper for disk vm-100-disk-1
> - Allowed the next RBD-Mirror scheduled snap to complete
> - At this point the snapshot is showing up on the remote side.
>
> root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
> vm-100-disk-1:
>   global_id:   a04e92df-3d64-4dc4-8ac8-eaba17b45403
>   state:       up+replaying
>   description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611247200,"remote_snapshot_timestamp":1611247200,"replay_state":"idle"}
>   service:     admin on Bunkcephmon1
>   last_update: 2021-01-21 11:46:24
>   peer_sites:
>     name: ccs
>     state: up+stopped
>     description: local image is primary
>     last_update: 2021-01-21 11:46:28
>
> root@Ccscephtest1:/etc/pve/priv# rbd snap ls --all CephTestPool1/vm-100-disk-1
> SNAPID  NAME                                                                                       SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
>  11532  TestSnapper                                                                                2 TiB             Thu Jan 21 11:21:25 2021  user
>  11573  .mirror.primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.9525e4eb-41c0-499c-8879-0c7d9576e253  2 TiB             Thu Jan 21 11:35:00 2021  mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
>
> Seems like the sync is complete, So I then clone it, map it and attempt to mount it.

Can you run "snap ls --all" on the non-primary cluster? The non-primary snapshot will list its status. On my cluster (with a much smaller image):

#
# CLUSTER 1
#

$ rbd --cluster cluster1 create --size 1G mirror/image1
$ rbd --cluster cluster1 mirror image enable mirror/image1 snapshot
Mirroring enabled
$ rbd --cluster cluster1 device map -t nbd mirror/image1
/dev/nbd0
$ mkfs.ext4 /dev/nbd0
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 50e0da12-1f99-4d45-b6e6-5f7a7decaeff
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

$ blkid /dev/nbd0
/dev/nbd0: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff" BLOCK_SIZE="4096" TYPE="ext4"
$ rbd --cluster cluster1 snap create mirror/image1@fs
Creating snap: 100% complete...done.
$ rbd --cluster cluster1 mirror image snapshot mirror/image1
Snapshot ID: 6
$ rbd --cluster cluster1 snap ls --all mirror/image1
SNAPID  NAME                                                                                       SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
     5  fs                                                                                         1 GiB             Thu Jan 21 14:50:24 2021  user
     6  .mirror.primary.f9f692b8-2405-416c-9247-5628e303947a.39722e17-f7e6-4050-acf0-3842a5620d81  1 GiB             Thu Jan 21 14:50:51 2021  mirror (primary peer_uuids:[cd643f30-4982-4caf-874d-cf21f6f4b66f])

#
# CLUSTER 2
#

$ rbd --cluster cluster2 mirror image status mirror/image1
image1:
  global_id:   f9f692b8-2405-416c-9247-5628e303947a
  state:       up+replaying
  description: replaying, {"bytes_per_second":1140872.53,"bytes_per_snapshot":17113088.0,"local_snapshot_timestamp":1611258651,"remote_snapshot_timestamp":1611258651,"replay_state":"idle"}
  service:     mirror.0 on cube-1
  last_update: 2021-01-21 14:51:18
  peer_sites:
    name: cluster1
    state: up+stopped
    description: local image is primary
    last_update: 2021-01-21 14:51:27

$ rbd --cluster cluster2 snap ls --all mirror/image1
SNAPID  NAME                                                                                           SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
     5  fs                                                                                             1 GiB             Thu Jan 21 14:50:52 2021  user
     6  .mirror.non_primary.f9f692b8-2405-416c-9247-5628e303947a.0a13b822-0508-47d6-a460-a8cc4e012686  1 GiB             Thu Jan 21 14:50:53 2021  mirror (non-primary peer_uuids:[] 9824df2b-86c4-4264-a47e-cf968efd09e1:6 copied)

$ rbd --cluster cluster2 --rbd-default-clone-format 2 clone mirror/image1@fs mirror/image2
$ rbd --cluster cluster2 device map -t nbd mirror/image2
/dev/nbd1
$ blkid /dev/nbd1
/dev/nbd1: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff" BLOCK_SIZE="4096" TYPE="ext4"
$ mount /dev/nbd1 /mnt/
$ mount | grep nbd
/dev/nbd1 on /mnt type ext4 (rw,relatime,seclabel)

> root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE
> root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> /dev/nbd0
> root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
>
> On the primary still no issues
>
> root@Ccscephtest1:/etc/pve/priv# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE
> root@Ccscephtest1:/etc/pve/priv# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> /dev/nbd0
> root@Ccscephtest1:/etc/pve/priv# mount /dev/nbd0 /usr2
>
> ________________________________
> From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> To: "adamb" <adamb@xxxxxxxxxx>
> Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> Sent: Thursday, January 21, 2021 9:42:26 AM
> Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
>
> On Thu, Jan 21, 2021 at 9:40 AM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> >
> > After the resync finished. I can mount it now.
> >
> > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > /dev/nbd0
> > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> >
> > Makes me a bit nervous how it got into that position and everything appeared ok.
>
> We unfortunately need to create the snapshots that are being synced as a first step, but perhaps there are some extra guardrails we can put on the system to prevent premature usage if the sync status doesn't indicate that it's complete.
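Until a guardrail like that exists in RBD itself, a rough client-side check can be run before a replicated snapshot is used on the non-primary side. This is only a sketch, assuming the pool, image and snapshot names from this thread; it greps the same plain-text output shown above, and it would not have caught the failure reported here, where the status already read "copied" and "idle":

#!/bin/sh
POOL=CephTestPool1
IMAGE=vm-100-disk-1
SNAP=TestSnapper1

# The non-primary mirror snapshot must report "copied" before the user
# snapshot is trusted.
rbd snap ls --all "$POOL/$IMAGE" | grep -q 'non-primary.*copied' \
    || { echo "mirror snapshot not fully copied, refusing to clone" >&2; exit 1; }

# The replayer should be idle, i.e. no sync still in flight.
rbd mirror image status "$POOL/$IMAGE" | grep -q '"replay_state":"idle"' \
    || { echo "rbd-mirror still replaying, refusing to clone" >&2; exit 1; }

rbd --rbd-default-clone-format 2 clone "$POOL/$IMAGE@$SNAP" "$POOL/${IMAGE}-CLONE"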
>
> > ________________________________
> > From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> > To: "adamb" <adamb@xxxxxxxxxx>
> > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > Sent: Thursday, January 21, 2021 9:25:11 AM
> > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> >
> > On Thu, Jan 21, 2021 at 8:34 AM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> > >
> > > When cloning the snapshot on the remote cluster I can't see my ext4 filesystem.
> > >
> > > Using the same exact snapshot on both sides. Shouldn't this be consistent?
> >
> > Yes. Has the replication process completed ("rbd mirror image status CephTestPool1/vm-100-disk-0")?
> >
> > > Primary Site
> > > root@Ccscephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
> > > 10621 TestSnapper1 2 TiB Thu Jan 21 08:15:22 2021 user
> > >
> > > root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > /dev/nbd0
> > > root@Ccscephtest1:~# mount /dev/nbd0 /usr2
> > >
> > > Secondary Site
> > > root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
> > > 10430 TestSnapper1 2 TiB Thu Jan 21 08:20:08 2021 user
> > >
> > > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > /dev/nbd0
> > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
> > >
> > > ________________________________
> > > From: "adamb" <adamb@xxxxxxxxxx>
> > > To: "dillaman" <dillaman@xxxxxxxxxx>
> > > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > Sent: Wednesday, January 20, 2021 3:42:46 PM
> > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > >
> > > Awesome information. I knew I had to be missing something.
> > >
> > > All of my clients will be far newer than Mimic, so I don't think that will be an issue.
> > >
> > > Added the following to my ceph.conf on both clusters:
> > >
> > > rbd_default_clone_format = 2
> > >
> > > root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE
> > > root@Bunkcephmon2:~# rbd ls CephTestPool2
> > > vm-100-disk-0-CLONE
> > >
> > > I am sure I will be back with more questions. Hoping to replace our Nimble storage with Ceph and NVMe.
> > >
> > > Appreciate it!
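The rbd_default_clone_format = 2 setting quoted above can also be passed per invocation rather than cluster-wide in ceph.conf; Jason's cluster2 example earlier in the thread uses exactly that form. A sketch with the image names from this message:

# Same effect as rbd_default_clone_format = 2 in ceph.conf, but scoped to a
# single clone operation:
rbd --rbd-default-clone-format 2 clone \
    CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE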
> > >
> > > ________________________________
> > > From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> > > To: "adamb" <adamb@xxxxxxxxxx>
> > > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > Sent: Wednesday, January 20, 2021 3:28:39 PM
> > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > >
> > > On Wed, Jan 20, 2021 at 3:10 PM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> > > >
> > > > That's what I thought as well, especially based on this.
> > > >
> > > > Note
> > > >
> > > > You may clone a snapshot from one pool to an image in another pool. For example, you may maintain read-only images and snapshots as templates in one pool, and writeable clones in another pool.
> > > >
> > > > root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE
> > > > 2021-01-20T15:06:35.854-0500 7fb889ffb700 -1 librbd::image::CloneRequest: 0x55c7cf8417f0 validate_parent: parent snapshot must be protected
> > > >
> > > > root@Bunkcephmon2:~# rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> > > > rbd: protecting snap failed: (30) Read-only file system
> > >
> > > You have two options: (1) protect the snapshot on the primary image so that the protection status replicates, or (2) utilize RBD clone v2, which doesn't require protection but does require Mimic or later clients [1].
> > >
> > > > From: "Eugen Block" <eblock@xxxxxx>
> > > > To: "adamb" <adamb@xxxxxxxxxx>
> > > > Cc: "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > > Sent: Wednesday, January 20, 2021 3:00:54 PM
> > > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > > >
> > > > But you should be able to clone the mirrored snapshot on the remote cluster even though it's not protected, IIRC.
> > > >
> > > > Zitat von Adam Boyhan <adamb@xxxxxxxxxx>:
> > > >
> > > > > Two separate 4 node clusters with 10 OSD's in each node. Micron 9300 NVMe's are the OSD drives. Heavily based on the Micron/Supermicro white papers.
> > > > >
> > > > > When I attempt to protect the snapshot on a remote image, it errors with read only.
> > > > >
> > > > > root@Bunkcephmon2:~# rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> > > > > rbd: protecting snap failed: (30) Read-only file system
> > > > > _______________________________________________
> > > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > > > _______________________________________________
> > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > >
> > > [1] https://ceph.io/community/new-mimic-simplified-rbd-image-cloning/
> > >
> > > --
> > > Jason
> >
> > --
> > Jason
>
> --
> Jason

--
Jason
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
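For completeness, option (1) above can be sketched as follows, reusing the image names from this thread; the assumption here is that the protection flag only becomes visible on the secondary once a subsequent mirror snapshot has replicated:

# On the primary cluster: protect the user snapshot so its protection
# status is carried over by the mirroring process.
rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1

# On the secondary cluster, once the next mirror snapshot has synced, a
# v1 clone no longer trips the "parent snapshot must be protected" check:
rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE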