Re: rgw multisite resync only one bucket

On Mon, Feb 27, 2017 at 11:40 AM, Marius Vaitiekunas <mariusvaitiekunas@xxxxxxxxx> wrote:


On Mon, Feb 27, 2017 at 9:59 AM, Marius Vaitiekunas <mariusvaitiekunas@xxxxxxxxx> wrote:


On Fri, Feb 24, 2017 at 6:35 PM, Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> wrote:
On Fri, Feb 24, 2017 at 3:59 AM, Marius Vaitiekunas
<mariusvaitiekunas@xxxxxxxxx> wrote:
>
>
> On Wed, Feb 22, 2017 at 8:33 PM, Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx>
> wrote:
>>
>> On Wed, Feb 22, 2017 at 6:19 AM, Marius Vaitiekunas
>> <mariusvaitiekunas@xxxxxxxxx> wrote:
>> > Hi Cephers,
>> >
>> > We are testing an rgw multisite solution between two DCs. We have one
>> > zonegroup and two zones. At the moment all writes/deletes are done only
>> > to the primary zone.
>> >
>> > Sometimes not all the objects are replicated. We've written a Prometheus
>> > exporter to check replication status. It gives us each bucket's object
>> > count from the user's perspective, because we have millions of objects
>> > and hundreds of buckets. We just want to be sure that everything is
>> > replicated, without using ceph internals like the rgw admin api for now.
>> >
>> > Is it possible to initiate a full resync of only one rgw bucket from the
>> > master zone? What are the options for a resync when things go wrong and
>> > replication misses some objects?
>> >
>> > We run the latest jewel, 10.2.5.
>>
>>
>> There's the 'radosgw-admin bucket sync init' command that you can run
>> on the specific bucket on the target zone. This will reinitialize the
>> sync state, so that when it starts syncing it will go through the
>> whole full sync process. Note that it shouldn't actually copy data
>> that already exists on the target. Also, in order to actually start
>> the sync, you'll need to have some change that would trigger the sync
>> on that bucket, e.g., create a new object there.
>>
>> Yehuda
>>
>
> Hi,
>
> I've tried to resync a bucket, but it didn't manage to resync a missing
> object. If I try to copy the missing object by hand into the secondary zone,
> I get asked to overwrite an existing object. It looks like the object is
> replicated but is not in the bucket index. I've tried to check the bucket
> index with the --fix and --check-objects flags, but nothing changes. What
> else should I try?
>

That's weird. Do you see anything when you run 'radosgw-admin bi list
--bucket=<bucket>'?

Yehuda

'radosgw-admin bi list --bucket=<bucket>' gives me an error:
2017-02-27 08:55:30.861659 7f20c15779c0  0 error in read_id for id  : (2) No such file or directory
2017-02-27 08:55:30.861991 7f20c15779c0  0 error in read_id for id  : (2) No such file or directory
ERROR: bi_list(): (5) Input/output error

'radosgw-admin bucket list --bucket=<bucket>' successfully lists all the files except the missing ones.





I've done some more investigation. These missing objects can be found in the "rgw.buckets.data" pool, but the bucket index is not aware of them.
How does 'radosgw-admin bucket check -b <bucket> --fix --check-objects' work?
I guess that it's not scanning the "rgw.buckets.data" pool for "leaked" objects? To me, these unreplicated objects look the same as leaked ones :)
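
To pin down exactly which keys the index is missing, it may help to diff two key listings for the same bucket. This is only a minimal sketch: it assumes the object keys have already been dumped into two plain lists (for example, the names reported by 'radosgw-admin bucket list' and the names recovered from the data pool). The names diff_listings and ListingDiff and the sample keys are hypothetical, and extracting keys from either source is not shown.

```python
from typing import Iterable, List, NamedTuple


class ListingDiff(NamedTuple):
    """Keys that appear in only one of the two listings."""
    only_in_first: List[str]
    only_in_second: List[str]


def diff_listings(first: Iterable[str], second: Iterable[str]) -> ListingDiff:
    """Compare two listings of object keys for the same bucket.

    Both arguments are assumed to be plain key names that were already
    extracted elsewhere; duplicates are ignored.
    """
    a, b = set(first), set(second)
    return ListingDiff(sorted(a - b), sorted(b - a))


if __name__ == "__main__":
    # Hypothetical sample data: keys known to the bucket index versus
    # keys found in the data pool.
    index_keys = ["obj1", "obj2"]
    pool_keys = ["obj1", "obj2", "obj3"]
    print(diff_listings(index_keys, pool_keys))
```

Keys that show up only in the second listing would be the candidates for objects that exist in the pool but are unknown to the index.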


 
By the way, in the rgw logs I can find all the missing files with an HTTP 304 return code. For example:
"GET /go84/WRWRDGROWKFKROTWKHXXIBHERRLHBK HTTP/1.1" 304 0 - -

All the gateways in both sites are behind haproxies. Any ideas?
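
To collect every request that got a 304 (HTTP "Not Modified", normally the reply to a conditional GET), the log lines can be filtered with a small script. This is a rough sketch that assumes the request part of each log line looks like the example above; the function name paths_with_status and the regular expression are illustrative, not part of rgw.

```python
import re
from typing import Iterable, List

# Matches the quoted request line followed by the status code, e.g.
#   "GET /bucket/key HTTP/1.1" 304 0 - -
# The layout is assumed from the single example line in this thread.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})\b'
)


def paths_with_status(lines: Iterable[str], status: int = 304) -> List[str]:
    """Return request paths whose response code equals `status`."""
    paths = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and int(match.group("status")) == status:
            paths.append(match.group("path"))
    return paths


if __name__ == "__main__":
    sample = [
        '"GET /go84/WRWRDGROWKFKROTWKHXXIBHERRLHBK HTTP/1.1" 304 0 - -',
    ]
    for path in paths_with_status(sample):
        print(path)
```

The resulting paths can then be compared against the list of objects missing from the secondary zone.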


--
Marius Vaitiekūnas
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
