I will have to do some looking into how that is done on Proxmox, but most definitely.

From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
To: "adamb" <adamb@xxxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
Sent: Friday, January 22, 2021 3:02:23 PM
Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses

Any chance you can attempt to repeat the process on the latest master
or pacific branch clients (no need to upgrade the MONs/OSDs)?

On Fri, Jan 22, 2021 at 2:32 PM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
>
> The steps are pretty straightforward.
>
> - Create rbd image of 500G on the primary
> - Enable rbd-mirror snapshot on the image
> - Map the image on the primary
> - Format the block device with ext4
> - Mount it and write out 200-300G worth of data (I am using rsync with some local real data we have)
> - Unmap the image from the primary
> - Create rbd snapshot
> - Create rbd mirror snapshot
> - Wait for copy process to complete
> - Clone the rbd snapshot on secondary
> - Map the image on secondary
> - Try to mount on secondary
>
> Just as a reference. All of my nodes are the same.
>
> root@Bunkcephtest1:~# ceph --version
> ceph version 15.2.8 (8b89984e92223ec320fb4c70589c39f384c86985) octopus (stable)
>
> root@Bunkcephtest1:~# dpkg -l | grep rbd-mirror
> ii rbd-mirror 15.2.8-pve2 amd64 Ceph daemon for mirroring RBD images
>
> This is pretty straightforward, I don't know what I could be missing here.
>
>
> ________________________________
> From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> To: "adamb" <adamb@xxxxxxxxxx>
> Cc: "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> Sent: Friday, January 22, 2021 2:11:36 PM
> Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
>
> Any chance you could write a small reproducer test script?
> I can't repeat what you are seeing and we do have test cases that really
> hammer random IO on primary images, create snapshots, rinse-and-repeat
> and they haven't turned up anything yet.
>
> Thanks!
>
> On Fri, Jan 22, 2021 at 1:50 PM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> >
> > I have been doing a lot of testing.
> >
> > The size of the RBD image doesn't have any effect.
> >
> > I run into the issue once I actually write data to the rbd. The more data I write out, the larger the chance of reproducing the issue.
> >
> > I seem to hit the issue of missing the filesystem altogether the most, but I have also had a few instances where some of the data was simply missing.
> >
> > I monitor the mirror status on the remote cluster until the snapshot is 100% copied and also make sure all the IO is done. My setup has no issue maxing out my 10G interconnect during replication, so it's pretty obvious once it's done.
> >
> > The only way I have found to resolve the issue is to call a mirror resync on the secondary array.
> >
> > I can then map the rbd on the primary, write more data to it, snap it again, and I am back in the same position.
> >
> > ________________________________
> > From: "adamb" <adamb@xxxxxxxxxx>
> > To: "dillaman" <dillaman@xxxxxxxxxx>
> > Cc: "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > Sent: Thursday, January 21, 2021 3:11:31 PM
> > Subject: Re: RBD-Mirror Snapshot Backup Image Uses
> >
> > Sure thing.
> >
> > root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-1
> > SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> > 12192 TestSnapper1 2 TiB Thu Jan 21 14:15:02 2021 user
> > 12595 .mirror.non_primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.34c4a53e-9525-446c-8de6-409ea93c5edd 2 TiB Thu Jan 21 15:05:02 2021 mirror (non-primary peer_uuids:[] 6c26557e-d011-47b1-8c99-34cf6e0c7f2f:12801 copied)
> >
> > root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
> > vm-100-disk-1:
> >   global_id: a04e92df-3d64-4dc4-8ac8-eaba17b45403
> >   state: up+replaying
> >   description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611259501,"remote_snapshot_timestamp":1611259501,"replay_state":"idle"}
> >   service: admin on Bunkcephmon1
> >   last_update: 2021-01-21 15:06:24
> >   peer_sites:
> >     name: ccs
> >     state: up+stopped
> >     description: local image is primary
> >     last_update: 2021-01-21 15:06:23
> >
> > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 CephTestPool1/vm-100-disk-1-CLONE
> > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
> > /dev/nbd0
> > root@Bunkcephtest1:~# blkid /dev/nbd0
> > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
> >
> >
> > Primary still looks good.
> >
> > root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 CephTestPool1/vm-100-disk-1-CLONE
> > root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
> > /dev/nbd0
> > root@Ccscephtest1:~# blkid /dev/nbd0
> > /dev/nbd0: UUID="830b8e05-d5c1-481d-896d-14e21d17017d" TYPE="ext4"
> > root@Ccscephtest1:~# mount /dev/nbd0 /usr2
> > root@Ccscephtest1:~# cat /proc/mounts | grep nbd0
> > /dev/nbd0 /usr2 ext4 rw,relatime 0 0
> >
> >
> > From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> > To: "adamb" <adamb@xxxxxxxxxx>
> > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > Sent: Thursday, January 21, 2021 3:01:46 PM
> > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> >
> > On Thu, Jan 21, 2021 at 11:51 AM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> > >
> > > I was able to trigger the issue again.
> > >
> > > - On the primary I created a snap called TestSnapper for disk vm-100-disk-1
> > > - Allowed the next RBD-Mirror scheduled snap to complete
> > > - At this point the snapshot is showing up on the remote side.
> > >
> > > root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
> > > vm-100-disk-1:
> > >   global_id: a04e92df-3d64-4dc4-8ac8-eaba17b45403
> > >   state: up+replaying
> > >   description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611247200,"remote_snapshot_timestamp":1611247200,"replay_state":"idle"}
> > >   service: admin on Bunkcephmon1
> > >   last_update: 2021-01-21 11:46:24
> > >   peer_sites:
> > >     name: ccs
> > >     state: up+stopped
> > >     description: local image is primary
> > >     last_update: 2021-01-21 11:46:28
> > >
> > > root@Ccscephtest1:/etc/pve/priv# rbd snap ls --all CephTestPool1/vm-100-disk-1
> > > SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> > > 11532 TestSnapper 2 TiB Thu Jan 21 11:21:25 2021 user
> > > 11573 .mirror.primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.9525e4eb-41c0-499c-8879-0c7d9576e253 2 TiB Thu Jan 21 11:35:00 2021 mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
> > >
> > > Seems like the sync is complete, so I then clone it, map it and attempt to mount it.
> >
> > Can you run "snap ls --all" on the non-primary cluster? The
> > non-primary snapshot will list its status.
> > On my cluster (with a much smaller image):
> >
> > #
> > # CLUSTER 1
> > #
> > $ rbd --cluster cluster1 create --size 1G mirror/image1
> > $ rbd --cluster cluster1 mirror image enable mirror/image1 snapshot
> > Mirroring enabled
> > $ rbd --cluster cluster1 device map -t nbd mirror/image1
> > /dev/nbd0
> > $ mkfs.ext4 /dev/nbd0
> > mke2fs 1.45.5 (07-Jan-2020)
> > Discarding device blocks: done
> > Creating filesystem with 262144 4k blocks and 65536 inodes
> > Filesystem UUID: 50e0da12-1f99-4d45-b6e6-5f7a7decaeff
> > Superblock backups stored on blocks:
> >   32768, 98304, 163840, 229376
> >
> > Allocating group tables: done
> > Writing inode tables: done
> > Creating journal (8192 blocks): done
> > Writing superblocks and filesystem accounting information: done
> > $ blkid /dev/nbd0
> > /dev/nbd0: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff" BLOCK_SIZE="4096" TYPE="ext4"
> > $ rbd --cluster cluster1 snap create mirror/image1@fs
> > Creating snap: 100% complete...done.
> > $ rbd --cluster cluster1 mirror image snapshot mirror/image1
> > Snapshot ID: 6
> > $ rbd --cluster cluster1 snap ls --all mirror/image1
> > SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> > 5 fs 1 GiB Thu Jan 21 14:50:24 2021 user
> > 6 .mirror.primary.f9f692b8-2405-416c-9247-5628e303947a.39722e17-f7e6-4050-acf0-3842a5620d81 1 GiB Thu Jan 21 14:50:51 2021 mirror (primary peer_uuids:[cd643f30-4982-4caf-874d-cf21f6f4b66f])
> >
> > #
> > # CLUSTER 2
> > #
> >
> > $ rbd --cluster cluster2 mirror image status mirror/image1
> > image1:
> >   global_id: f9f692b8-2405-416c-9247-5628e303947a
> >   state: up+replaying
> >   description: replaying, {"bytes_per_second":1140872.53,"bytes_per_snapshot":17113088.0,"local_snapshot_timestamp":1611258651,"remote_snapshot_timestamp":1611258651,"replay_state":"idle"}
> >   service: mirror.0 on cube-1
> >   last_update: 2021-01-21 14:51:18
> >   peer_sites:
> >     name: cluster1
> >     state: up+stopped
> >     description: local image is primary
> >     last_update: 2021-01-21 14:51:27
> > $ rbd --cluster cluster2 snap ls --all mirror/image1
> > SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> > 5 fs 1 GiB Thu Jan 21 14:50:52 2021 user
> > 6 .mirror.non_primary.f9f692b8-2405-416c-9247-5628e303947a.0a13b822-0508-47d6-a460-a8cc4e012686 1 GiB Thu Jan 21 14:50:53 2021 mirror (non-primary peer_uuids:[] 9824df2b-86c4-4264-a47e-cf968efd09e1:6 copied)
> > $ rbd --cluster cluster2 --rbd-default-clone-format 2 clone mirror/image1@fs mirror/image2
> > $ rbd --cluster cluster2 device map -t nbd mirror/image2
> > /dev/nbd1
> > $ blkid /dev/nbd1
> > /dev/nbd1: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff" BLOCK_SIZE="4096" TYPE="ext4"
> > $ mount /dev/nbd1 /mnt/
> > $ mount | grep nbd
> > /dev/nbd1 on /mnt type ext4 (rw,relatime,seclabel)
> >
> > > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE
> > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > /dev/nbd0
> > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
> > >
> > > On the primary still no issues
> > >
> > > root@Ccscephtest1:/etc/pve/priv# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE
> > > root@Ccscephtest1:/etc/pve/priv# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > /dev/nbd0
> > > root@Ccscephtest1:/etc/pve/priv# mount /dev/nbd0 /usr2
> > >
> > >
> > > ________________________________
> > > From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> > > To: "adamb" <adamb@xxxxxxxxxx>
> > > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > Sent: Thursday, January 21, 2021 9:42:26 AM
> > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > >
> > > On Thu, Jan 21, 2021 at 9:40 AM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> > > >
> > > > After the resync finished, I can mount it now.
> > > >
> > > > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > > /dev/nbd0
> > > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > > >
> > > > Makes me a bit nervous how it got into that position and everything appeared ok.
> > >
> > > We unfortunately need to create the snapshots that are being synced as
> > > a first step, but perhaps there are some extra guardrails we can put
> > > on the system to prevent premature usage if the sync status doesn't
> > > indicate that it's complete.
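
Until such guardrails exist, a client-side check before cloning is one option. A minimal sketch, assuming the `rbd snap ls --all` listing format shown in this thread: the hypothetical helper `mirror_snapshot_copied` (not part of the rbd CLI) succeeds only when the listing contains a non-primary mirror snapshot whose status is a bare `copied` directly after the `uuid:snapid` pair. That an in-progress sync prints something else in that position is an assumption and has not been verified here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the rbd CLI.
# Reads an `rbd snap ls --all` listing on stdin and succeeds only if it
# contains a non-primary mirror snapshot reported as copied, e.g.
#   ... mirror (non-primary peer_uuids:[] <uuid>:<snapid> copied)
# Assumption: a partially synced snapshot does not show a bare "copied"
# directly after the <uuid>:<snapid> pair.
mirror_snapshot_copied() {
    grep -E 'mirror \(non-primary .*:[0-9]+ copied\)' >/dev/null
}

# Example use (cluster, pool, and image names are placeholders):
#   if rbd --cluster cluster2 snap ls --all mirror/image1 | mirror_snapshot_copied; then
#       rbd --cluster cluster2 --rbd-default-clone-format 2 clone \
#           mirror/image1@fs mirror/image2
#   fi
```

Note that the listings in this thread show a snapshot reported as `copied` whose clone still failed to mount, so a check like this only guards against premature use. It does not address the problem being reported.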
> > >
> > > > ________________________________
> > > > From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> > > > To: "adamb" <adamb@xxxxxxxxxx>
> > > > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > > Sent: Thursday, January 21, 2021 9:25:11 AM
> > > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > > >
> > > > On Thu, Jan 21, 2021 at 8:34 AM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> > > > >
> > > > > When cloning the snapshot on the remote cluster I can't see my ext4 filesystem.
> > > > >
> > > > > Using the same exact snapshot on both sides. Shouldn't this be consistent?
> > > >
> > > > Yes. Has the replication process completed ("rbd mirror image status
> > > > CephTestPool1/vm-100-disk-0")?
> > > >
> > > > > Primary Site
> > > > > root@Ccscephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
> > > > > 10621 TestSnapper1 2 TiB Thu Jan 21 08:15:22 2021 user
> > > > >
> > > > > root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > > > root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > > > /dev/nbd0
> > > > > root@Ccscephtest1:~# mount /dev/nbd0 /usr2
> > > > >
> > > > > Secondary Site
> > > > > root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
> > > > > 10430 TestSnapper1 2 TiB Thu Jan 21 08:20:08 2021 user
> > > > >
> > > > > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > > > /dev/nbd0
> > > > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > > > > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
> > > > >
> > > > > ________________________________
> > > > > From: "adamb" <adamb@xxxxxxxxxx>
> > > > > To: "dillaman" <dillaman@xxxxxxxxxx>
> > > > > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > > > Sent: Wednesday, January 20, 2021 3:42:46 PM
> > > > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > > > >
> > > > > Awesome information. I knew I had to be missing something.
> > > > >
> > > > > All of my clients will be far newer than mimic so I don't think that will be an issue.
> > > > >
> > > > > Added the following to my ceph.conf on both clusters.
> > > > >
> > > > > rbd_default_clone_format = 2
> > > > >
> > > > > root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE
> > > > > root@Bunkcephmon2:~# rbd ls CephTestPool2
> > > > > vm-100-disk-0-CLONE
> > > > >
> > > > > I am sure I will be back with more questions. Hoping to replace our Nimble storage with Ceph and NVMe.
> > > > >
> > > > > Appreciate it!
> > > > >
> > > > > ________________________________
> > > > > From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
> > > > > To: "adamb" <adamb@xxxxxxxxxx>
> > > > > Cc: "Eugen Block" <eblock@xxxxxx>, "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > > > Sent: Wednesday, January 20, 2021 3:28:39 PM
> > > > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > > > >
> > > > > On Wed, Jan 20, 2021 at 3:10 PM Adam Boyhan <adamb@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > That's what I thought as well, especially based on this.
> > > > > >
> > > > > >
> > > > > > Note
> > > > > >
> > > > > > You may clone a snapshot from one pool to an image in another pool. For example, you may maintain read-only images and snapshots as templates in one pool, and writeable clones in another pool.
> > > > > >
> > > > > > root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE
> > > > > > 2021-01-20T15:06:35.854-0500 7fb889ffb700 -1 librbd::image::CloneRequest: 0x55c7cf8417f0 validate_parent: parent snapshot must be protected
> > > > > >
> > > > > > root@Bunkcephmon2:~# rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> > > > > > rbd: protecting snap failed: (30) Read-only file system
> > > > >
> > > > > You have two options: (1) protect the snapshot on the primary image so
> > > > > that the protection status replicates or (2) utilize RBD clone v2
> > > > > which doesn't require protection but does require Mimic or later
> > > > > clients [1].
> > > > >
> > > > > >
> > > > > > From: "Eugen Block" <eblock@xxxxxx>
> > > > > > To: "adamb" <adamb@xxxxxxxxxx>
> > > > > > Cc: "ceph-users" <ceph-users@xxxxxxx>, "Matt Wilder" <matt.wilder@xxxxxxxxxx>
> > > > > > Sent: Wednesday, January 20, 2021 3:00:54 PM
> > > > > > Subject: Re: Re: RBD-Mirror Snapshot Backup Image Uses
> > > > > >
> > > > > > But you should be able to clone the mirrored snapshot on the remote
> > > > > > cluster even though it’s not protected, IIRC.
> > > > > >
> > > > > >
> > > > > > Quoting Adam Boyhan <adamb@xxxxxxxxxx>:
> > > > > >
> > > > > > > Two separate 4 node clusters with 10 OSD's in each node. Micron 9300
> > > > > > > NVMe's are the OSD drives. Heavily based on the Micron/Supermicro
> > > > > > > white papers.
> > > > > > >
> > > > > > > When I attempt to protect the snapshot on a remote image, it errors
> > > > > > > with read only.
> > > > > > >
> > > > > > > root@Bunkcephmon2:~# rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> > > > > > > rbd: protecting snap failed: (30) Read-only file system
> > > > > > > _______________________________________________
> > > > > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > > > >
> > > > > [1] https://ceph.io/community/new-mimic-simplified-rbd-image-cloning/
> > > > >
> > > > > --
> > > > > Jason
> > > >
> > > > --
> > > > Jason
> > >
> > > --
> > > Jason
> >
> > --
> > Jason
>
> --
> Jason

--
Jason
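
The reproduction steps listed at the top of this thread could be turned into a script along the following lines. This is only a sketch, not a tested reproducer: it uses the `--cluster` style from the `cluster1`/`cluster2` transcript rather than separate hosts, and the cluster names, image name, snapshot name, mount point, and rsync source are all assumed placeholders. The wait step relies on the assumption that a finished sync shows a bare `copied` after the `uuid:snapid` pair in `rbd snap ls --all`. By default the script only prints the commands; setting `DRY_RUN=0` would execute them.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the reproduction steps described in this thread.
# All names below are assumed placeholders, not values from a real setup.
set -euo pipefail

PRIMARY="${PRIMARY:-cluster1}"          # assumed --cluster name of the primary
SECONDARY="${SECONDARY:-cluster2}"      # assumed --cluster name of the secondary
IMAGE="${IMAGE:-mirror/repro}"          # assumed pool/image
SNAP="${SNAP:-fs}"                      # assumed user snapshot name
MNT="${MNT:-/mnt}"                      # assumed mount point
DATA_SRC="${DATA_SRC:-/srv/testdata/}"  # hypothetical data to copy in
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print only, 0 = execute

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Map an image through nbd and store the device path in DEV.
# $1 = cluster name, $2 = image spec.
map_nbd() {
    echo "+ rbd --cluster $1 device map -t nbd $2"
    if [ "$DRY_RUN" = "0" ]; then
        DEV="$(rbd --cluster "$1" device map -t nbd "$2")"
    else
        DEV="/dev/nbdX"   # placeholder shown in dry-run output
    fi
}

# Poll the secondary until a non-primary mirror snapshot shows "copied".
wait_copied() {
    echo "+ wait for a copied non-primary mirror snapshot on $SECONDARY"
    if [ "$DRY_RUN" = "0" ]; then
        until rbd --cluster "$SECONDARY" snap ls --all "$IMAGE" \
                | grep -E 'mirror \(non-primary .*:[0-9]+ copied\)' >/dev/null; do
            sleep 10
        done
    fi
}

reproduce() {
    # Primary: create the image, enable snapshot mirroring, add data.
    run rbd --cluster "$PRIMARY" create --size 500G "$IMAGE"
    run rbd --cluster "$PRIMARY" mirror image enable "$IMAGE" snapshot
    map_nbd "$PRIMARY" "$IMAGE"
    run mkfs.ext4 "$DEV"
    run mount "$DEV" "$MNT"
    run rsync -a "$DATA_SRC" "$MNT/"
    run umount "$MNT"   # unmount before unmapping so cached data is flushed
    run rbd --cluster "$PRIMARY" device unmap -t nbd "$DEV"

    # Primary: user snapshot, then a mirror snapshot to replicate it.
    run rbd --cluster "$PRIMARY" snap create "$IMAGE@$SNAP"
    run rbd --cluster "$PRIMARY" mirror image snapshot "$IMAGE"

    # Secondary: wait, clone with clone v2, map, and try to mount.
    wait_copied
    run rbd --cluster "$SECONDARY" --rbd-default-clone-format 2 clone \
        "$IMAGE@$SNAP" "$IMAGE-clone"
    map_nbd "$SECONDARY" "$IMAGE-clone"
    run blkid "$DEV"
    run mount "$DEV" "$MNT"   # with DRY_RUN=0, a failure here stops the script
}

reproduce
```

With `DRY_RUN=0`, a non-zero exit at the final `mount` would correspond to the "wrong fs type, bad option, bad superblock" failure seen on the secondary in this thread.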