Re: rbd volume upgrades

Josh Durgin <josh.durgin@xxxxxxxxxxx> · Fri, 09 Nov 2012 12:34:41 -0800

On 11/09/2012 12:31 PM, Yehuda Sadeh wrote:
On Fri, Nov 9, 2012 at 12:30 PM, Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:
On Fri, Nov 9, 2012 at 12:26 PM, Josh Durgin <josh.durgin@xxxxxxxxxxx> wrote:
On 11/09/2012 12:09 PM, Alex Elder wrote:

On 11/09/2012 02:03 PM, Josh Durgin wrote:

On 11/09/2012 11:44 AM, Yehuda Sadeh wrote:

On Fri, Nov 9, 2012 at 11:30 AM, Josh Durgin <josh.durgin@xxxxxxxxxxx> wrote:

On 11/09/2012 11:08 AM, Yehuda Sadeh wrote:

On Fri, Nov 9, 2012 at 11:04 AM, Josh Durgin <josh.durgin@xxxxxxxxxxx> wrote:

On 11/09/2012 11:01 AM, Gregory Farnum wrote:

I was asked today if there's a way to upgrade RBD volumes from v1 to
v2. I didn't think so, but wanted
1) to make sure I'm right,
2) to ask how hard it would be,
3) to ask if we haven't done it because it didn't occur to us or
because it's too hard.
-Greg

This was addressed in the original discussions about format 2.

You need to export and then import the volume as format 2. Format 2
uses different names for objects, so providing an 'upgrade' path would
still require copying all the data around.

Couldn't we just set a flag in the header specifying the object naming
version, which would then only require updating the header?

Yehuda

The header was separated from the id object to allow renames to work
while the image was in use or with cloning. The whole header format
changed and moved to a different object as a result. It would be
messy to implement this kind of upgrade, and it wouldn't provide much
benefit when there's already an easy way to convert. If someone really
wanted it, it could be implemented, but otherwise I don't think it's
worth adding. It would have to be added to the upcoming kernel
layering support too.

The assumption is that when you upgrade you don't go back, so the fact
that the header was separated from the id object doesn't change much.
An upgrade process would be the same as creating a new v2 image
whose object names (prefix?) are set to the original object names,
with a version field specifying that these are v1 names.

The problem that I see with converting v1 to v2 through copy is that
(besides the cumbersome and potentially very long process) we will end
up turning sparse data objects into fully written data objects, which
will increase the space consumed.

That's a good point about export. It would be good to make export create
sparse files as well, but since it doesn't yet, the in-place upgrade
would be better for space usage.
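
To illustrate what a sparse export could look like, here is a minimal
sketch, not the rbd tool's actual implementation. It assumes the image
data arrives as an iterable of byte chunks; the function name
write_sparse and its parameters are hypothetical. All-zero chunks are
skipped with a seek so the filesystem can leave holes instead of
allocating blocks (whether holes are actually created depends on the
filesystem).

```python
import os


def write_sparse(chunks, path, chunk_size):
    """Write byte chunks to path, seeking over all-zero chunks.

    chunks: iterable of bytes objects, each at most chunk_size long
    (hypothetical input; a real export would read from the image).
    Returns the logical size of the file that was written.
    """
    zero = bytes(chunk_size)
    total = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            if chunk == zero[:len(chunk)]:
                # Nothing but zeros: advance the offset without writing.
                out.seek(len(chunk), os.SEEK_CUR)
            else:
                out.write(chunk)
            total += len(chunk)
        # Fix the final length in case the data ended with a hole.
        out.truncate(total)
    return total
```

Reading the file back returns zeros for the skipped ranges, so the
logical content is unchanged; only the allocated space differs.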

Plus!  It looks like you don't even need a flag.

I think if you simply recorded the old-format object prefix in the
new format header, all would be fine.  The format of the object
id has not changed between v1 and v2, just the object prefix.

You still need a flag to tell whether there should be an 'rbd_data.' prefix
(format 2) or an 'rb.' prefix (format 1) before the object_prefix
stored in the header.

So maybe instead of having a format version it'll just be a string
that specifies either 'rb.' or 'rbd_.'?

that is 'rbd_data.'

Yeah, that would be easier to change later on. We'd just need to 
interpret lack of that setting as 'rbd_data.' to be compatible
with existing format 2 images.
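
A rough sketch of that lookup, under stated assumptions: the header is
modeled as a plain dict, the field names 'naming_prefix' and
'object_prefix' are hypothetical, and the hex width of the object
number suffix is chosen only for illustration. The point is just that a
missing setting falls back to 'rbd_data.'.

```python
# Assumed default so that existing format 2 headers, which lack the
# setting, keep resolving to the names they already use.
DEFAULT_NAMING_PREFIX = "rbd_data."


def data_object_name(header, object_no):
    """Build a data object name from a (hypothetical) header dict."""
    # 'naming_prefix' is the proposed string setting: 'rb.' for an image
    # upgraded in place from format 1, absent for native format 2.
    naming_prefix = header.get("naming_prefix", DEFAULT_NAMING_PREFIX)
    # The 12-digit hex suffix is an illustrative assumption.
    return "%s%s.%012x" % (naming_prefix, header["object_prefix"], object_no)
```

With this, a header of {"object_prefix": "abc123"} would resolve under
'rbd_data.', while an upgraded image carrying "naming_prefix": "rb."
would keep pointing at its original format 1 objects.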
