Using multisite to migrate data between bucket data pools.

This is a tangent on Paul Emmerich's response to "Correct Migration Workflow Replicated -> Erasure Code". I've tried Paul's method before to migrate between two data pools, but I ran into some issues.

The first issue seems like a bug in RGW: the RGW instance for the new zone was able to pull data directly from the data pool of the original zone once the metadata had been synced. Because the metadata indicated the object existed, the gateway went ahead and fetched it from the pool backing the other zone. I worked around that partially by using cephx capabilities to restrict which pools each zone's RGW could access, but that returns a permission-denied error instead of a not-found error. This happens both on buckets that are set not to replicate and on buckets that failed to sync properly. It seems like a bit of a security threat, though not a common situation at all.
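To illustrate the cephx workaround (a sketch only; the key and pool names here are placeholders, not the ones from my cluster), the gateway key for the new zone can be narrowed so it only touches its own zone's pools:

```shell
# Hypothetical example: restrict the new zone's RGW cephx key so it
# cannot read the original zone's data pool. Names are placeholders.
ceph auth caps client.rgw.newzone \
    mon 'allow rw' \
    osd 'allow rwx pool=newzone.rgw.buckets.data, allow rwx pool=newzone.rgw.buckets.index, allow rwx pool=newzone.rgw.log, allow rwx pool=newzone.rgw.control, allow rwx pool=newzone.rgw.meta'

# Verify the resulting capabilities
ceph auth get client.rgw.newzone
```

With caps like these, reads that cross into the other zone's data pool fail with EPERM rather than ENOENT, which matches the behavior described above.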

The second issue, I think, has to do with corrupt index entries in my index pool. Some of the buckets I don't need any more, so I tried to delete them for simplicity, but the delete command failed. I've set those aside for now; the ones I no longer need can simply be set not to replicate at the bucket level. That works for most things, but I have a few buckets that I do need to migrate, and when I enable replication on them the data sync between zones gets stuck. Does anyone have any ideas on how to clean up the bucket indexes to make these operations possible?
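For reference, these are the kinds of commands involved (bucket names are placeholders); radosgw-admin has a bucket check mode that can attempt index repairs, which may or may not help with indexes corrupted like mine:

```shell
# Disable replication for a bucket that no longer needs to migrate
radosgw-admin bucket sync disable --bucket=old-bucket

# Check the bucket index for inconsistencies, then attempt a repair
radosgw-admin bucket check --bucket=old-bucket
radosgw-admin bucket check --fix --bucket=old-bucket

# Try deleting the bucket, purging any remaining objects
radosgw-admin bucket rm --bucket=old-bucket --purge-objects
```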

At this point I've disabled multisite and cleaned out the new zone so I can run operations on these buckets without dealing with multisite and replication. I've tried a few things and can get additional information on my specific errors tomorrow at work.


---------- Forwarded message ---------
From: Paul Emmerich <paul.emmerich@xxxxxxxx>
Date: Wed, Oct 30, 2019 at 4:32 AM
Subject: Re: Correct Migration Workflow Replicated -> Erasure Code
To: Konstantin Shalygin <k0ste@xxxxxxxx>
Cc: Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>, ceph-users <ceph-users@xxxxxxxx>


We've solved this off-list (because I already got access to the cluster)

For the list:

Copying at the rados level is possible, but it requires shutting down radosgw to get a consistent copy. That wasn't feasible here due to the size and performance requirements.
Instead, we added a second zone to the zonegroup whose placement maps to an EC pool, and it's currently copying over the data. We'll then make the second zone master and default and ultimately delete the first one.
This allows a migration without downtime.
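A rough sketch of that zone setup (zone, pool, endpoint, and EC-profile names are made up for illustration; adjust to your cluster):

```shell
# Create an EC pool to back the new zone's bucket data
ceph osd pool create default.rgw.ec.buckets.data 64 64 erasure myprofile

# Add a second zone to the existing zonegroup
radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=ec-zone \
    --endpoints=http://rgw2.example.com:8080

# Point the new zone's default placement at the EC data pool
radosgw-admin zone placement modify --rgw-zone=ec-zone \
    --placement-id=default-placement \
    --data-pool=default.rgw.ec.buckets.data

radosgw-admin period update --commit

# Later, once the data has synced: promote the new zone
radosgw-admin zone modify --rgw-zone=ec-zone --master --default
radosgw-admin period update --commit
```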

Another possibility would be using a Transition lifecycle rule, but
that's not ideal because it doesn't actually change the bucket's placement.
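For completeness, such a Transition rule would look roughly like this (the storage class, bucket, and endpoint are assumptions; the storage class would have to be defined in the zone's placement and backed by the EC pool). It moves objects into another storage class but leaves the bucket itself where it is:

```shell
# Hypothetical lifecycle configuration transitioning all objects to an
# assumed storage class "EC_CLASS" backed by an EC pool
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "move-to-ec",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [
        {"Days": 1, "StorageClass": "EC_CLASS"}
      ]
    }
  ]
}
EOF

aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
    --lifecycle-configuration file://lifecycle.json \
    --endpoint-url http://rgw.example.com:8080
```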

I don't think it would be too complicated to add a native bucket
migration mechanism that works similar to "bucket rewrite" (which is
intended for something similar but different).

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
