Copy locked parent and clones to another pool

Hi,

 

I created an erasure coded pool and thereafter created RBD images by specifying the ‘--data-pool’ parameter. I subsequently created locked snapshots and cloned them for systems I was setting up. After finishing I realised that I hadn’t specified the ‘--data-pool’ parameter when creating the clones, damn! Any changes on the clones were being stored directly in the ‘rbd_ssd’ pool instead of in the erasure coded ‘ec_ssd’ pool…
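
A quick way to spot the affected clones is ‘rbd info’, which should only print a ‘data_pool’ line for images that actually have a separate data pool. A rough check along these lines (image names as used further down):

for ID in 211 212 213 214; do
  for f in 1 2 3; do
    # Clones created with --data-pool report e.g. "data_pool: ec_ssd" here;
    # the mis-created ones print nothing, hence the fallback message:
    echo -n "vm-$ID-disk-$f: ";
    rbd info rbd_ssd/vm-$ID-disk-$f | grep data_pool || echo "no separate data pool";
  done
done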

 

There were 4 systems with 3 disks each, so for each cloned image I renamed it, created a new clone (using the ‘--data-pool’ switch this time) and then used some Perl that has come in handy many times to copy over only the 4 MB chunks whose MD5 hash didn’t match between the source and destination block devices.

 

This way the source and destination images end up 100% identical, and any blocks that already match the original parent are skipped, so they remain unallocated in the new clone.
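
Since the one-liners below are rather dense, here is the same pipeline wrapped in a function with each stage commented (‘diffcopy’ is merely an illustrative name; the loops further down inline the identical commands):

# Sketch only: the MD5 block-compare pipeline used below, wrapped in a function.
# Usage: diffcopy SOURCE_DEVICE DESTINATION_DEVICE (the devices must be the same size).
diffcopy () {
  # Stage 1: read the destination in 4 MiB records and print the raw 16 byte
  #          MD5 digest of each block.
  perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\4194304};print md5($_)' "$2" |
    # Stage 2: read the source in 4 MiB records, compare each block's digest
    #          with the destination digest arriving on stdin; print "s" for a
    #          matching block, or "c" followed by the 4 MiB source block.
    perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\4194304};$b=md5($_);
      read STDIN,$a,16;if ($a eq $b) {print "s"} else {print "c" . $_}' "$1" |
      # Stage 3: stdout is opened read/write on the destination (1<>); count
      #          consecutive "s" markers, seek forward past those matching
      #          blocks and write only the blocks flagged "c".
      perl -ne 'BEGIN{$/=\1} if ($_ eq"s") {$s++} else {if ($s) {
        seek STDOUT,$s*4194304,1; $s=0}; read ARGV,$buf,4194304; print $buf}' 1<> "$2";
}

Calling it as ‘diffcopy $dev1 $dev2’ would behave exactly like the inline version below.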

PS: It would be nice to be able to retrieve the CRC values that the object store keeps for its blocks, as this would avoid having to read the full images just to calculate an MD5 sum per block…

for ID in 211 212 213 214; do
  for f in 1 2 3; do
    # Keep the mis-created clone under a new name, then recreate it with its
    # data on the erasure coded pool:
    rbd mv rbd_ssd/vm-$ID-disk-$f rbd_ssd/original-$ID-disk-$f;
    rbd clone rbd_ssd/base-210-disk-"$f"@__base__ rbd_ssd/vm-$ID-disk-"$f" --data-pool ec_ssd;
  done
done

# Two of the originals had been resized after cloning, so grow the new clones
# to match before copying:
rbd resize rbd_ssd/vm-213-disk-3 --size 50G;
rbd resize rbd_ssd/vm-214-disk-3 --size 1T;

for ID in 211 212 213 214; do
  for f in 1 2 3; do
    export dev1=`rbd map rbd_ssd/original-$ID-disk-$f --name client.admin -k /etc/pve/priv/ceph.client.admin.keyring`;
    export dev2=`rbd map rbd_ssd/vm-$ID-disk-$f --name client.admin -k /etc/pve/priv/ceph.client.admin.keyring`;
    # Copy only the 4 MiB blocks whose MD5 differs (see the commented breakdown above):
    perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\4194304};print md5($_)' $dev2 |
      perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\4194304};$b=md5($_);
        read STDIN,$a,16;if ($a eq $b) {print "s"} else {print "c" . $_}' $dev1 |
          perl -ne 'BEGIN{$/=\1} if ($_ eq"s") {$s++} else {if ($s) {
            seek STDOUT,$s*4194304,1; $s=0}; read ARGV,$buf,4194304; print $buf}' 1<> $dev2;
    rbd unmap $dev1;
    rbd unmap $dev2;
  done
done

# Compare amount of used space:
for ID in 211 212 213 214; do
  for f in 1 2 3; do
    echo -e "\nNAME                 PROVISIONED      USED";
    rbd du rbd_ssd/original-$ID-disk-"$f" 2> /dev/null | grep -P "^\S+disk-$f\s" | while read n a u; do printf "%-22s %9s %9s\n" $n $a $u; done;
    rbd du rbd_ssd/vm-$ID-disk-"$f" 2> /dev/null | grep -P "^\S+disk-$f\s" | while read n a u; do printf "%-22s %9s %9s\n" $n $a $u; done;
  done
done

 

Sample output:

NAME                 PROVISIONED      USED
original-211-disk-1        4400M    28672k
vm-211-disk-1              4400M    28672k

NAME                 PROVISIONED      USED
original-211-disk-2       30720M     6312M
vm-211-disk-2             30720M     6300M

NAME                 PROVISIONED      USED
original-211-disk-3       20480M     2092M
vm-211-disk-3             20480M     2088M

vm-211-disk-3 uses 4 MB less data than original-211-disk-3, presumably a block that still matches the clone parent and was therefore never written, but validating the content of the images confirms that they are identical:

 

ID=211;
f=3;
export dev1=`rbd map rbd_ssd/original-$ID-disk-$f --name client.admin -k /etc/pve/priv/ceph.client.admin.keyring`;
export dev2=`rbd map rbd_ssd/vm-$ID-disk-$f --name client.admin -k /etc/pve/priv/ceph.client.admin.keyring`;
dd if=$dev1 bs=128M 2> /dev/null | sha1sum;
dd if=$dev2 bs=128M 2> /dev/null | sha1sum;
rbd unmap $dev1;
rbd unmap $dev2;

 

Output:

979ab34ea645ef6f16c3dbb5d3a78152018ea8e7  -
979ab34ea645ef6f16c3dbb5d3a78152018ea8e7  -

PS: qemu-img runs much faster than the Perl nightmare above, as it knows which blocks contain data, BUT it writes all of that data every time, so using it with snapshot rotations results in each snapshot consuming the full source image data size. The Perl method has reading overhead (Ceph does, however, feed it zeros for unallocated blocks, which aren’t actually read from anywhere), so it’s much slower than qemu-img, but it exclusively copies blocks which are different.
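
For reference, this is the shape of the qemu-img invocation used (commented out) in the script below, with my reading of the flags as comments; the pool and image names here are just illustrative:

# qemu-img convert flags as used further down (my reading of them):
#   -f raw -O raw          source and destination are raw images
#   -t unsafe -T unsafe    destination and source cache modes (no flushing)
#   -n                     do not create the destination, it already exists
#   -W                     allow out-of-order writes to the destination
#   -p                     show progress
#   -S 4M                  treat 4M runs of zeros as sparse and skip writing them
qemu-img convert -f raw -O raw -t unsafe -T unsafe -nWp -S 4M \
  rbd:rbd_ssd/vm-211-disk-1@backupinprogress rbd:rbd_hdd/vm-211-disk-1_backup;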

The following may also be useful to others. It’s a relatively simple script that uses the Perl method above to back up images from one pool to another. The script could easily be tweaked to use LVM snapshots as a destination, and the method is compatible with any block device.

 

Notes:

  We have rbd_ssd/base-210-disk-X as a protected snapshot (clone parent) and then have 4 children where each VM has 3 disks. As a prerequisite you would need to create the destination images and ensure that their sizes match the source images (a sketch for this follows below). The script rotates 3 snapshots each time it runs and additionally creates a temporary snapshot of the source images (not the static clone parent) before comparing the block devices:
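
Something along these lines can create the destination images with matching sizes beforehand (a sketch only: it assumes ‘jq’ is installed, that the source pool is rbd_ssd as in the earlier examples, and that the image sizes are whole MiB multiples):

src='rbd_ssd'; dst='rbd_hdd';
for ID in 211 212 213 214; do
  for f in 1 2 3; do
    # Source image size in bytes, via rbd's JSON output:
    bytes=`rbd info --format json "$src"/vm-"$ID"-disk-"$f" | jq .size`;
    # Create the backup destination with the same size (rbd sizes default to MiB):
    rbd create "$dst"/vm-"$ID"-disk-"$f"_backup --size $(( bytes / 1048576 ))M;
  done
done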

#!/bin/sh

# Source and destination pools:
src='';
dst='rbd_hdd';

# rbdsnap IMAGE SNAPNAME - returns 0 (true) if the named snapshot exists on the image:
rbdsnap () {
  [ "x" = "$1"x ] && return 1;
  [ `rbd snap ls $1 | grep -Pc "^\s+\d+\s+$2\s"` -gt 0 ] && return 0 || return 1;
}

# Backup 'template-debian-9.3' (clone parent) - Should never change so no need to maintain snapshots or run it on a continual basis:
#for ID in 210; do
#  for f in 1 2 3; do
#    echo -en "\t\t : Copying "$src"/base-"$ID"-disk-"$f"@__base__ to "$dst"/vm-"$ID"-disk-"$f"_backup";
#    qemu-img convert -f raw -O raw -t unsafe -T unsafe -nWp -S 4M rbd:"$src"/base-"$ID"-disk-"$f"@__base__ rbd:"$dst"/vm-"$ID"-disk-"$f"_backup;
#  done
#done

# Backup images (clone children):
for ID in 211 212 213 214; do
  for f in 1 2 3; do
    # Rotate the destination snapshots (snap1 newest, snap3 oldest) and take a
    # new snap1 before this run's copy:
    rbdsnap "$dst"/vm-"$ID"-disk-"$f"_backup snap3 && rbdsnap "$dst"/vm-"$ID"-disk-"$f"_backup snap2 && rbd snap rm "$dst"/vm-"$ID"-disk-"$f"_backup@snap3;
    rbdsnap "$dst"/vm-"$ID"-disk-"$f"_backup snap3 || rbdsnap "$dst"/vm-"$ID"-disk-"$f"_backup snap2 && rbd snap rename "$dst"/vm-"$ID"-disk-"$f"_backup@snap2 "$dst"/vm-"$ID"-disk-"$f"_backup@snap3;
    rbdsnap "$dst"/vm-"$ID"-disk-"$f"_backup snap2 || rbdsnap "$dst"/vm-"$ID"-disk-"$f"_backup snap1 && rbd snap rename "$dst"/vm-"$ID"-disk-"$f"_backup@snap1 "$dst"/vm-"$ID"-disk-"$f"_backup@snap2;
    rbdsnap "$dst"/vm-"$ID"-disk-"$f"_backup snap1 || rbd snap create "$dst"/vm-"$ID"-disk-"$f"_backup@snap1;
    # Snapshot the live source so the copy below reads a consistent point-in-time image:
    rbd snap create "$src"/vm-"$ID"-disk-"$f"@backupinprogress;
  done
  for f in 1 2 3; do
    echo -en "\t\t : Copying "$src"/vm-"$ID"-disk-"$f" to "$dst"/vm-"$ID"-disk-"$f"_backup";
    #qemu-img convert -f raw -O raw -t unsafe -T unsafe -nWp -S 4M rbd:"$src"/vm-"$ID"-disk-"$f"@backupinprogress rbd:"$dst"/vm-"$ID"-disk-"$f"_backup;
    export dev1=`rbd map "$src"/vm-"$ID"-disk-"$f@backupinprogress" --name client.admin -k /etc/pve/priv/ceph.client.admin.keyring`;
    export dev2=`rbd map "$dst"/vm-"$ID"-disk-"$f"_backup --name client.admin -k /etc/pve/priv/ceph.client.admin.keyring`;
    # Same MD5 block-compare pipeline as before: write only the 4 MiB blocks that differ into the backup image:
    perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\4194304};print md5($_)' $dev2 |
      perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\4194304};$b=md5($_);
        read STDIN,$a,16;if ($a eq $b) {print "s"} else {print "c" . $_}' $dev1 |
          perl -ne 'BEGIN{$/=\1} if ($_ eq"s") {$s++} else {if ($s) {
            seek STDOUT,$s*4194304,1; $s=0}; read ARGV,$buf,4194304; print $buf}' 1<> $dev2;
    rbd unmap $dev1;
    rbd unmap $dev2;
    rbd snap rm "$src"/vm-"$ID"-disk-"$f"@backupinprogress;
  done
done

Commenting out everything from ‘export dev1’ to ‘rbd unmap $dev2’ and uncommenting the qemu-img command yields the following:

  real    0m48.598s
  user    0m14.583s
  sys     0m10.986s

[admin@kvm5a ~]# rbd du rbd_hdd/vm-211-disk-3_backup
NAME                       PROVISIONED   USED
vm-211-disk-3_backup@snap3      20480M  2764M
vm-211-disk-3_backup@snap2      20480M  2764M
vm-211-disk-3_backup@snap1      20480M  2764M
vm-211-disk-3_backup            20480M  2764M
<TOTAL>                         20480M 11056M

Repeating the copy using the Perl solution is much slower, but as the VM is currently off nothing has changed and each snapshot consumes zero additional data:

  real    1m49.000s
  user    1m34.339s
  sys     0m17.847s

[admin@kvm5a ~]# rbd du rbd_hdd/vm-211-disk-3_backup
warning: fast-diff map is not enabled for vm-211-disk-3_backup. operation may be slow.
NAME                       PROVISIONED  USED
vm-211-disk-3_backup@snap3      20480M 2764M
vm-211-disk-3_backup@snap2      20480M     0
vm-211-disk-3_backup@snap1      20480M     0
vm-211-disk-3_backup            20480M     0
<TOTAL>                         20480M 2764M

PS: Not sure if this is a Ceph display bug, but why would the snapshot base be reported as not consuming any data while the first snapshot (rotated to ‘snap3’) reports all the usage? Purging all snapshots yields the following:

[admin@kvm5a ~]# rbd du rbd_hdd/vm-211-disk-3_backup
warning: fast-diff map is not enabled for vm-211-disk-3_backup. operation may be slow.
NAME                 PROVISIONED  USED
vm-211-disk-3_backup      20480M 2764M
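
Incidentally, the ‘fast-diff map is not enabled’ warnings above should go away if the relevant image features are enabled on the backup images, something like the following (untested on these particular images):

# fast-diff depends on object-map, which in turn depends on exclusive-lock:
rbd feature enable rbd_hdd/vm-211-disk-3_backup exclusive-lock object-map fast-diff;
# Rebuild the object map so existing data is accounted for:
rbd object-map rebuild rbd_hdd/vm-211-disk-3_backup;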

Regards

David Herselman
