On 12/23/2015 10:06 AM, Wido den Hollander wrote: > This allows user to use the volume wiping functionality of the libvirt > storage driver. > > This patch also adds a new wiping algorithm VIR_STORAGE_VOL_WIPE_ALG_DISCARD > > By default the VIR_STORAGE_VOL_WIPE_ALG_ZERO algorithm is used and with > RBD this will called rbd_write() in chunks of the underlying object size > to completely zero out the volume. > > With VIR_STORAGE_VOL_WIPE_ALG_DISCARD it will call rbd_discard() in the > same object size chunks which will trim/discard all underlying RADOS objects > in the Ceph cluster. > > Signed-off-by: Wido den Hollander <wido@xxxxxxxxx> > --- > include/libvirt/libvirt-storage.h | 4 + > src/storage/storage_backend_rbd.c | 155 +++++++++++++++++++++++++++++++++++++- > tools/virsh-volume.c | 2 +- > 3 files changed, 159 insertions(+), 2 deletions(-) > Found these buried in my todo list of things to look at from during the holiday break. I figure by bumping it'll bring it back into focus... "Semantically" speaking - this patch is a v2 of the original patch series... I'm still a bit conflicted whether to add a new option to Wipe or whether a new API should be developed. I see value in both options. Although perhaps thinking of this as "trim" and not "discard" could make it more palatable for wipe. As a new API, each backend driver could decide whether it supports the discard/trim option, but that's quite a bit more work (essentially mimic the Wipe functionality, but generate Trim). I'll note off the top that if we go with adding a new wipe algorithm and we've updated virsh-volume.c to recognize that, then virsh.pod would also need an update to describe it. Also rather than one patch here - I suggest smaller individual patches to make it easier to debug issues down the line when using git bisect. I see perhaps 4 patches... Patch 1: You probably want to start by adjusting virStorageBackendVolWipeLocal. In particular, the switch statement there needs some tweaking - first to use the "switch ((virStorageVolWipeAlgorithm) algorithm) {" construct, but also fixing a 'bug' I just noted in the current design. If the current 'default:' option is taken, the code reports an error, but still attempts the SCRUB command (which will return/cause a different error). BTW: Instead of default it would the *_LAST case... If you're really ambitious, adding a check for the "expected" 'flags' bits would also be beneficial especially since you'll be adding one. Patch 2: Add wipe support for rbd and add the Zero algorithm. This gives a base. The switch in virStorageBackendRBDVolWipe could still remain, but the Flags would only be for _ZERO Patch 3: Then add a the 'trim' option to libvirt-storage.h, virsh-volume.c, and virsh.pod... Patch 4: This patch would add the 'trim' support to the backend. Also grab virStorageBackendRBDImageInfo from patch 2. You're making the same 'stripe_count' call in this patch, but don't have the same checks. If you're concerned about the perhaps extra unnecessary calls you could allow the 3 return parameters to be NULL, then prior to fetching do "if (param)" type trick. The caller could then provide a NULL if they don't care about features and unit... > diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h > index 2c55c93..139add3 100644 > --- a/include/libvirt/libvirt-storage.h > +++ b/include/libvirt/libvirt-storage.h > @@ -153,6 +153,10 @@ typedef enum { > > VIR_STORAGE_VOL_WIPE_ALG_RANDOM = 8, /* 1-pass random */ > > + VIR_STORAGE_VOL_WIPE_ALG_DISCARD = 9, /* 1-pass, discard all data on the > + volume by using TRIM or > + DISCARD */ Assuming we use wipe, I think "TRIM" with the description of the option to be "trimming" the contents of the volume. Whether that's sparse files, thin/sparse logical volumes, or rbd object discarding... The aren't your problem to solve here, unless you have that desire to make those changes too. Also, the 2nd/3rd comments should line up under 1-pass... > + > # ifdef VIR_ENUM_SENTINELS > VIR_STORAGE_VOL_WIPE_ALG_LAST > /* > diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c > index cdbfdee..d13658d 100644 > --- a/src/storage/storage_backend_rbd.c > +++ b/src/storage/storage_backend_rbd.c > @@ -32,6 +32,7 @@ > #include "base64.h" > #include "viruuid.h" > #include "virstring.h" > +#include "virutil.h" This isn't necessary I believe. I was able to remove without issue. > #include "rados/librados.h" > #include "rbd/librbd.h" > > @@ -700,6 +701,157 @@ static int virStorageBackendRBDResizeVol(virConnectPtr conn ATTRIBUTE_UNUSED, > return ret; > } > > +static int virStorageBackendRBDVolWipeZero(rbd_image_t image, > + char *imgname, > + rbd_image_info_t info, > + uint64_t stripe_count) Newer libvirt convention is: static int virStorage... > +{ > + int r = -1; Add: int ret = -1; Usually it's 'ret' instead of just 'r'... Keeping 'r' for rbd_*() call failures fine though since that will contain (and possibly message) rbd_* specific API call errors... > + size_t offset = 0; > + uint64_t length; > + char *writebuf; > + > + if (VIR_ALLOC_N(writebuf, info.obj_size * stripe_count) < 0) > + goto cleanup; > + > + while (offset < info.size) { > + length = MIN((info.size - offset), (info.obj_size * stripe_count)); > + > + r = rbd_write(image, offset, length, writebuf); > + if (r < 0) { > + virReportSystemError(-r, _("writing %llu bytes failed on " > + " RBD image %s at offset %llu"), This will generate two spaces "... failed on RBD image..." > + (unsigned long long)length, > + imgname, > + (unsigned long long)offset); So is length a "uint64_t" or not? I do note that librdb.h deems it a "size_t"... The query is more why caste to (unsigned long long) other than the %llu (of course). As for offset, IIRC the convention is "%zu", although for this one I note that the librdb.h deems it a "uint64_t". > + goto cleanup; > + } > + > + VIR_DEBUG("Wrote %llu bytes to RBD image %s at offset %llu", > + (unsigned long long)length, > + imgname, (unsigned long long)offset); similar comments regarding the castes and the variable types. > + > + offset += length; > + } Here would be: ret = 0; > + > + cleanup: writebuf is leaked. Need a VIR_FREE() > + return r; and this becomes return ret; > +} > + > +static int virStorageBackendRBDVolWipeDiscard(rbd_image_t image, > + char *imgname, > + rbd_image_info_t info, > + uint64_t stripe_count) static int virStorage... > +{ > + int r = -1; Need int ret = -1 > + size_t offset = 0; > + uint64_t length; > + > + VIR_DEBUG("Wiping RBD %s volume using discard)", imgname); > + > + while (offset < info.size) { > + length = MIN((info.size - offset), (info.obj_size * stripe_count)); > + > + r = rbd_discard(image, offset, length); rbd_discard deems 'offset' to also be a uint64_t > + if (r < 0) { > + virReportSystemError(-r, _("discarding %llu bytes failed on " > + " RBD image %s at offset %llu"), similar to *Zero - you'll have "...failed on RBD image..." > + (unsigned long long)length, > + imgname, > + (unsigned long long)offset); similar comments regarding caste's of length and offset > + goto cleanup; > + } > + > + VIR_DEBUG("Discarded %llu bytes of RBD image %s at offset %llu", > + (unsigned long long)length, > + imgname, (unsigned long long)offset); similar comments regarding caste's > + > + offset += length; > + } Here would be ret = 0; > + > + cleanup: > + return r; And return ret; > +} > + > +static int virStorageBackendRBDVolWipe(virConnectPtr conn, > + virStoragePoolObjPtr pool, > + virStorageVolDefPtr vol, > + unsigned int algorithm, > + unsigned int flags) static int virStorage... > +{ > + virStorageBackendRBDState ptr; > + ptr.cluster = NULL; > + ptr.ioctx = NULL; > + rbd_image_t image = NULL; > + rbd_image_info_t info; > + uint64_t stripe_count; > + int r = -1; Add int ret = -1; > + > + virCheckFlags(VIR_STORAGE_VOL_WIPE_ALG_ZERO | > + VIR_STORAGE_VOL_WIPE_ALG_DISCARD, -1); > + > + VIR_DEBUG("Wiping RBD image %s/%s", pool->def->source.name, vol->name); > + > + if (virStorageBackendRBDOpenRADOSConn(&ptr, conn, &pool->def->source) < 0) > + goto cleanup; > + > + if (virStorageBackendRBDOpenIoCTX(&ptr, pool) < 0) > + goto cleanup; > + > + r = rbd_open(ptr.ioctx, vol->name, &image, NULL); > + if (r < 0) { BTW: This can be : if ((r = rbd_open(ptr.ioctx, vol->name, &image, NULL)) < 0) { For this and all rbd_* calls... > + virReportSystemError(-r, _("failed to open the RBD image %s"), > + vol->name); > + goto cleanup; > + } > + > + r = rbd_stat(image, &info, sizeof(info)); > + if (r < 0) { > + virReportSystemError(-r, _("failed to stat the RBD image %s"), > + vol->name); > + goto cleanup; > + } > + > + r = rbd_get_stripe_count(image, &stripe_count); > + if (r < 0) { > + virReportSystemError(-r, _("failed to get stripe count of RBD image %s"), > + vol->name); > + goto cleanup; > + } I see the subsequent patch has some extra checks before calling this. Why wouldn't those also need to be made here? > + > + VIR_DEBUG("Need to wipe %llu bytes from RBD image %s/%s", > + (unsigned long long)info.size, pool->def->source.name, vol->name); > + > + switch (algorithm) { Follow the convention of "switch ((virStorageVolWipeAlgorithm) algorithm) {" Then each "case" lines up under "switch". > + case VIR_STORAGE_VOL_WIPE_ALG_ZERO: > + r = virStorageBackendRBDVolWipeZero(image, vol->name, > + info, stripe_count); I would change this (and the next one) to: if (virStorageBackendRBDVolWipeZero(image, vol->name, info, stripe_count) < 0) goto cleanup; Also, I ran these patches through Coverity - it complains that 'info' is passed by value of 160 bytes... Although neither API adjusts it, why not just pass "info.size" and "info.obj_size" or pass by reference the whole 'info' (just to be safe). > + break; > + case VIR_STORAGE_VOL_WIPE_ALG_DISCARD: > + r = virStorageBackendRBDVolWipeDiscard(image, vol->name, > + info, stripe_count); > + break; > + default: And listing each case allowed - so it's clearer. That way if someone in the future comes along and adds ALG_ONE, the rbd code isn't forgotten to be adjusted... The compiler catches it. > + virReportError(VIR_ERR_INVALID_ARG, _("unsupported algorithm %d"), > + algorithm); > + r = -VIR_ERR_INVALID_ARG; This will be unnecessary... > + goto cleanup; > + } > + > + if (r < 0) { > + virReportSystemError(-r, _("failed to wipe RBD image %s"), > + vol->name); This overwrites the errors found in the *WipeZero and *WipeDiscard API's > + goto cleanup; > + } The assumption here being ret = 0; > + > + cleanup: > + if (image) > + rbd_close(image); > + > + virStorageBackendRBDCloseRADOSConn(&ptr); > + return r; return ret; > +} > + > virStorageBackend virStorageBackendRBD = { > .type = VIR_STORAGE_POOL_RBD, > > @@ -708,5 +860,6 @@ virStorageBackend virStorageBackendRBD = { > .buildVol = virStorageBackendRBDBuildVol, > .refreshVol = virStorageBackendRBDRefreshVol, > .deleteVol = virStorageBackendRBDDeleteVol, > - .resizeVol = virStorageBackendRBDResizeVol, > + .wipeVol = virStorageBackendRBDVolWipe, > + .resizeVol = virStorageBackendRBDResizeVol No need to remove the "," - that way the only diff is the line. > }; > diff --git a/tools/virsh-volume.c b/tools/virsh-volume.c > index 7932ef2..3e95aa5 100644 > --- a/tools/virsh-volume.c > +++ b/tools/virsh-volume.c > @@ -954,7 +954,7 @@ static const vshCmdOptDef opts_vol_wipe[] = { > VIR_ENUM_DECL(virStorageVolWipeAlgorithm) > VIR_ENUM_IMPL(virStorageVolWipeAlgorithm, VIR_STORAGE_VOL_WIPE_ALG_LAST, > "zero", "nnsa", "dod", "bsi", "gutmann", "schneier", > - "pfitzner7", "pfitzner33", "random"); > + "pfitzner7", "pfitzner33", "random", "discard"); I think "trim" will be better. John > > static bool > cmdVolWipe(vshControl *ctl, const vshCmd *cmd) > -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list