Re: RGW Replication

Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> · Mon, 03 Feb 2014 14:34:44 -0800

      On 2/3/14 10:51, Gregory Farnum wrote:

      On Mon, Feb 3, 2014 at 10:43 AM, Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> wrote:

        I've been noticing something strange with my RGW federation.  I added some
statistics to radosgw-agent to try and get some insight
(https://github.com/ceph/radosgw-agent/pull/7), but that just showed me that
I don't understand how replication works.

When PUT traffic was relatively slow to the master zone, replication had no
issues keeping up.  Now I'm trying to cause replication to fall behind, by
deliberately exceeding the amount of bandwidth between the two zones
(they're in different datacenters).  Instead of falling behind, both the
radosgw-agent logs and the stats I added say that the slave zone is keeping up.

But some of the numbers don't add up.  I'm not using enough bandwidth
between the two facilities, and I'm not using enough disk space in the slave
zone.  The disk usage in the slave zone continues to fall further and
further behind the master.  Despite this, I'm always able to download
objects from both zones.

How does radosgw-agent actually replicate metadata and data?  Does
radosgw-agent actually copy all the bytes, or does it create placeholders in
the slave zone?  If radosgw-agent is creating placeholders in the slave
zone, and radosgw populates the placeholder in the background, then that
would explain the behavior I'm seeing.  If this is true, how can I tell if
replication is keeping up?

      Are you overwriting the same objects? Replication copies over the
"present" version of an object, not all the versions which have ever
existed. Similarly, the slave zone doesn't keep all the
(garbage-collected) logs that the master zone has to, so those factors
would be one way to get differing disk counts.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

    Before I started this import, the master zone was using 3.54TB
    (raw), and the slave zone was using 3.42TB (raw).  I did overwrite
    some objects, and the 120GB difference is plausible for overwrites.

    I haven't deleted anything yet, so the only garbage collection would
    be overwritten objects.  Right?

    I imported 1.93TB of data.  Replication is currently 2x, so that's
    3.86TB (raw).  Now the master is using 7.48TB (raw), and the slave
    is using 4.89TB (raw).  The master zone looks correct, but the slave
    zone is missing 2.59TB (raw).  That's 66% of my imported data.
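
The arithmetic above can be checked with a short script. This is only a sketch: it uses the figures quoted in this message, the variable and function names are made up for illustration, and it assumes all of the raw growth in each zone comes from the import.

```python
# Figures quoted in this message, all in TB (raw) unless noted.
master_before, slave_before = 3.54, 3.42
master_after, slave_after = 7.48, 4.89
imported_logical = 1.93   # TB of object data imported (not raw)
replicas = 2              # "Replication is currently 2x"


def raw_growth(before, after):
    """Raw space a zone gained between the two measurements."""
    return round(after - before, 2)


# Raw space the import should consume in each zone.
imported_raw = round(imported_logical * replicas, 2)

master_growth = raw_growth(master_before, master_after)
slave_growth = raw_growth(slave_before, slave_after)

# Difference between the zones after the import.
gap = round(master_after - slave_after, 2)

# Share of the imported raw data that shows up in the slave zone.
slave_fraction = slave_growth / imported_raw

print(f"expected raw growth per zone: {imported_raw} TB")
print(f"master grew by {master_growth} TB, slave grew by {slave_growth} TB")
print(f"gap between zones: {gap} TB")
print(f"slave holds {slave_fraction:.0%} of the imported raw data")
```

With these inputs the master grew by 3.94TB (raw) and the slave by 1.47TB (raw), against an expected 3.86TB (raw) per zone, and the zones differ by 2.59TB (raw).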

    The 33% of data the slave does have is in line with the amount of
    bandwidth I see between the two facilities.  I see an increase of
    ~150 Mbps when the import is running on the master, and ~50 Mbps on
    the slave.

    Just to verify that I'm not overwriting objects, I checked the
    Apache logs.  Since I started the import, there have been 1328542
    PUTs (including normal site traffic).  1301511 of those are unique.
    I'll investigate the 27031 duplicates, but the dups are only 34GB.
    Not nearly enough to account for the discrepancy.
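
A duplicate count like the one above could be reproduced along these lines. This is a rough sketch, not the commands actually used: it assumes the Apache access log is in the common or combined format, that the bucket name is part of the request path (not the Host header), and the function name and sample lines are invented for illustration.

```python
import re
from collections import Counter

# Matches the request field of a common/combined log line,
# e.g. "PUT /bucket/key HTTP/1.1", and captures the path.
REQUEST_RE = re.compile(r'"PUT (\S+) HTTP/[\d.]+"')


def count_puts(lines):
    """Return (total PUTs, unique PUT paths, duplicate PUTs)."""
    paths = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            paths[match.group(1)] += 1
    total = sum(paths.values())
    unique = len(paths)
    return total, unique, total - unique


# Invented sample lines; a real run would iterate over the log file.
sample = [
    '10.0.0.1 - - [03/Feb/2014:10:00:00 -0800] "PUT /bucket/a.jpg HTTP/1.1" 200 0',
    '10.0.0.1 - - [03/Feb/2014:10:00:01 -0800] "PUT /bucket/b.jpg HTTP/1.1" 200 0',
    '10.0.0.1 - - [03/Feb/2014:10:00:02 -0800] "PUT /bucket/a.jpg HTTP/1.1" 200 0',
    '10.0.0.2 - - [03/Feb/2014:10:00:03 -0800] "GET /bucket/a.jpg HTTP/1.1" 200 512',
]

total, unique, duplicates = count_puts(sample)
print(total, unique, duplicates)
```

Because the captured path includes any query string, the parts of a multipart upload count as distinct PUTs here; whether that is the right definition of "unique" depends on how the import uploads objects.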

    From your answer, I'll assume there are no placeholders involved. 
    If radosgw-agent says we're up to date, the data should exist in the
    slave zone.

    Now I'm really confused.

     Craig Lewis 

     Senior Systems Engineer

      Office +1.714.602.1309

      Email clewis@xxxxxxxxxxxxxxxxxx

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com