Re: rgw-multisite: do we need an atomic option for RGWAsyncPutSystemObj?

On 09/04/2018 11:18 PM, Xinying Song wrote:
Hi, Casey:

Our environment is based on luminous.

I'm a little confused about this commit. It adds a new member called
order_cr to RGWSyncShardMarkerTrack, and order_cr always holds the
newest update-marker cr. But suppose it already holds an update-marker
cr and a new update-marker cr arrives: it will drop its reference to
the older update-marker cr. Won't that put() destroy the old
update-marker cr, which may still be in progress?
Even setting aside the possible memory corruption, RADOS would still
receive multiple out-of-order write operations, so the problem seems
to persist.

Or did I miss some key point? Could you give some tips about this
fix? Thanks!

It looks like the magic happens in the while loop of RGWLastCallerWinsCR::operate(). The use of 'yield call()' there means that it won't resume until the spawned coroutine completes, so this prevents us from ever having more than one outstanding write to the marker.

If a second marker write comes in while the first is still running, it gets stored in 'cr' until the first call() completes.

If a third write comes in, it overwrites 'cr' and drops the reference to the second write. Since the second write hadn't been scheduled yet with call(), it's perfectly safe to drop the last ref and destroy it. If it -had- already been scheduled, then 'cr' was reset to nullptr before call(), and RGWLastCallerWinsCR::call_cr() won't try to drop its ref.
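The behavior Casey describes can be illustrated with a small toy model. This is not the Ceph source (the class, member, and method names below are hypothetical stand-ins for RGWLastCallerWinsCR, its 'cr' member, and its operate()/call_cr() methods); it just demonstrates the "at most one outstanding write, last caller wins" invariant:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Toy model (not Ceph code) of the last-caller-wins pattern: at most one
// marker write is outstanding at a time. While one is in flight, newer
// markers overwrite a single pending slot (like the 'cr' member), so an
// intermediate marker is dropped before it is ever scheduled -- which is
// why releasing its reference is safe.
class LastCallerWins {
  std::optional<std::string> slot;    // pending marker, like 'cr'
  std::optional<std::string> active;  // marker currently being written
  std::vector<std::string> issued;    // completed writes, in order

public:
  // submit a marker update
  void call(const std::string& marker) {
    if (!active) {
      active = marker;  // nothing in flight: start writing immediately
    } else {
      slot = marker;    // overwrite (drop) any not-yet-scheduled marker
    }
  }

  // simulate completion of the in-flight write; mirrors the while loop
  // in operate(): when one call finishes, pick up whatever is in 'slot'
  void complete() {
    if (!active) {
      return;
    }
    issued.push_back(*active);
    active = slot;      // the slot was emptied before the next "call"
    slot.reset();
  }

  const std::vector<std::string>& writes() const { return issued; }
  bool busy() const { return active.has_value(); }
};
```

Submitting m1, m2, and m3 while m1 is still in flight issues exactly two writes, m1 then m3: m2 is overwritten while still unscheduled, so dropping it never touches a write that RADOS has already seen, and the writes that do go out are strictly ordered.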

I hope that helps!
Casey

Casey Bodley <cbodley@xxxxxxxxxx> wrote on Tuesday, September 4, 2018 at 10:26 PM:

On 09/03/2018 05:02 AM, Xinying Song wrote:
Hi, cephers:

We have been suffering from a problem with rgw-multisite. `radosgw-admin
sync status` sometimes shows that data shards are behind their peers.
If no more log entries are added to the corresponding shard of the peer
zone, i.e. there are no new writes, the sync marker of this shard stays
stuck on that old marker and does not advance. Restarting the rgw
daemon resolves this warning.

The RGW log shows that syncmarker in the incremental_sync() function has
been updated to the peer's newest marker. Gdb shows that the pending and
finish_markers variables of marker_tracker are empty. (I forgot to check
the syncmarker variable...)

I guess this problem is caused by the non-atomic marker update. Since
the marker update is handled by an RGWAsyncPutSystemObj op, those ops
may arrive out of order when delivered to RADOS. Maybe we should add an
id_tag attr to make this op atomic.

This problem is not easy to reproduce in a testing environment, so I'd
prefer to ask you guys for advice first, in case I'm on the wrong track.

Thanks.
I think Yehuda saw this while testing the cloud sync work, and added a
RGWLastCallerWinsCR to guarantee the ordering of marker updates in
commit 1034a68fd12687ac81e6afc4718dbc8045648034. Does your branch
include that commit, or is it based on luminous? We won't be backporting
cloud sync as a feature, but we should probably take that one commit - I
opened a ticket for this backport at http://tracker.ceph.com/issues/35539.

Thanks,
Casey
