Re: About the problem "export_diff relies on clone_overlap, which is lost when cache tier is enabled"

Hi, everyone.

While trying to solve the issue http://tracker.ceph.com/issues/20896, I
just ran another test: I did some writes to the object
"rbd_data.1ebc6238e1f29.0000000000000000" to promote its "HEAD" object
to the cache tier, after which I wrote 4 bytes of random data at
offset 0x40. Then I used "ceph-objectstore-tool" to dump its "HEAD"
version in the base tier. The result is as follows (before I promoted
it to the cache tier, there were three snaps, the latest of which is
26):

{
    "id": {
        "oid": "rbd_data.1ebc6238e1f29.0000000000000000",
        "key": "",
        "snapid": -2,
        "hash": 1655893237,
        "max": 0,
        "pool": 3,
        "namespace": "",
        "max": 0
    },
    "info": {
        "oid": {
            "oid": "rbd_data.1ebc6238e1f29.0000000000000000",
            "key": "",
            "snapid": -2,
            "hash": 1655893237,
            "max": 0,
            "pool": 3,
            "namespace": ""
        },
        "version": "4219'16423",
        "prior_version": "3978'16310",
        "last_reqid": "osd.70.4213:2359",
        "user_version": 17205,
        "size": 4194304,
        "mtime": "2017-08-03 22:07:34.656122",
        "local_mtime": "2017-08-05 23:02:33.628734",
        "lost": 0,
        "flags": 52,
        "snaps": [],
        "truncate_seq": 0,
        "truncate_size": 0,
        "data_digest": 2822203961,
        "omap_digest": 4294967295,
        "watchers": {}
    },
    "stat": {
        "size": 4194304,
        "blksize": 4096,
        "blocks": 8200,
        "nlink": 1
    },
    "SnapSet": {
        "snap_context": {
            "seq": 26,
            "snaps": [
                26,
                25,
                16
            ]
        },
        "head_exists": 1,
        "clones": [
            {
                "snap": 16,
                "size": 4194304,
                "overlap": "[4~4194300]"
            },
            {
                "snap": 25,
                "size": 4194304,
                "overlap": "[]"
            },
            {
                "snap": 26,
                "size": 4194304,
                "overlap": "[]"
            }
        ]
    }
}

As we can see, its clone_overlap for snap 26 is empty, which,
combined with the previous test described in
http://tracker.ceph.com/issues/20896, means that the writes' "modified
range" is recorded neither in the cache tier nor in the base tier.

I think we really should move the clone overlap modification out
of the IF block guarded by the "is_present_clone" condition check. For
now, I can't see any other way to fix this problem.
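
To make the proposal concrete, here is a toy model in Python. It is only
a sketch, not the actual Ceph C++ code: the helper names "subtract" and
"update_newest_overlap", the (offset, length) tuple representation, and
the assumed starting overlap of [0~4194304] for snap 26 are all
illustrative assumptions. It compares the current behaviour (overlap
updated only when the newest clone is present in this tier) with the
proposed one (overlap updated whenever any clone is recorded).

```python
def subtract(overlap, modified):
    """Return `overlap` minus `modified`; both are lists of (offset, length)."""
    result = []
    for o_off, o_len in overlap:
        pieces = [(o_off, o_off + o_len)]  # half-open [start, end) ranges
        for m_off, m_len in modified:
            m_end = m_off + m_len
            next_pieces = []
            for start, end in pieces:
                if m_end <= start or m_off >= end:
                    next_pieces.append((start, end))  # no intersection
                    continue
                if start < m_off:
                    next_pieces.append((start, m_off))  # part before the write
                if m_end < end:
                    next_pieces.append((m_end, end))    # part after the write
            pieces = next_pieces
        result.extend((s, e - s) for s, e in pieces)
    return result


def update_newest_overlap(clone_overlap, modified, clone_present,
                          require_present_clone):
    """Model of the overlap update for the most recent clone.

    clone_overlap: dict mapping snap id -> list of (offset, length).
    require_present_clone=True models the current "is_present_clone" gate;
    False models moving the subtraction out of that IF block.
    """
    if not clone_overlap:
        return clone_overlap  # no clones recorded at all
    if require_present_clone and not clone_present:
        return clone_overlap  # write is not recorded in the overlap
    newest = max(clone_overlap)
    updated = dict(clone_overlap)
    updated[newest] = subtract(clone_overlap[newest], modified)
    return updated


full = {26: [(0, 4194304)]}   # assumed: snap 26 fully overlaps HEAD
write = [(0x40, 4)]           # the 4-byte write at offset 0x40
print(update_newest_overlap(full, write, clone_present=False,
                            require_present_clone=True))
print(update_newest_overlap(full, write, clone_present=False,
                            require_present_clone=False))
```

With the gate in place and the clone absent from the tier, the overlap
for snap 26 comes back unchanged, so the write leaves no trace in it.
Without the gate, the range 0x40~4 is removed from the overlap, which is
the behaviour I am proposing.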

Am I right about this?

On 4 August 2017 at 03:14, Xuehan Xu <xxhdx1985126@xxxxxxxxx> wrote:
> I mean, I think it's the condition check "is_present_clone" that
> prevents the clone overlap from recording the range modified by client
> writes when the target "HEAD" object exists without its most
> recent clone object. If I'm right, just moving the clone overlap
> modification out of the "is_present_clone" condition check block is
> enough to solve this case, just like the PR
> https://github.com/ceph/ceph/pull/16790, and this fix wouldn't cause
> other problems.
>
> In our test, this fix solved the problem successfully; however, we
> can't yet confirm that it won't cause new problems.
>
> So if anyone sees this and knows the answer, please help us. Thank you:-)
>
> On 4 August 2017 at 11:41, Xuehan Xu <xxhdx1985126@xxxxxxxxx> wrote:
>> Hi, grep:-)
>>
>> I finally got what you mean in https://github.com/ceph/ceph/pull/16790.
>>
>> I agree with you that "clone overlap is supposed to be tracking
>> which data is the same on disk".
>>
>> My thought is that "ObjectContext::new_snapset.clones" is already an
>> indicator of whether there are clone objects on disk. So, in the
>> "cache tier" scenario, although a clone oid may not correspond to
>> a "present clone" in the cache tier, as long as
>> "ObjectContext::new_snapset.clones" is not empty, there must be
>> such a clone object in the base tier. And, as long as
>> "ObjectContext::new_snapset.clones" has a strict "one-to-one"
>> correspondence to "ObjectContext::new_snapset.clone_overlap", passing
>> the condition check "if (ctx->new_snapset.clones.size() > 0)" is
>> enough to conclude that the clone object exists.
>>
>> So, if I'm right, passing the condition check "if
>> (ctx->new_snapset.clones.size() > 0)" is already enough for us to do
>> "newest_overlap.subtract(ctx->modified_ranges)"; it doesn't also have
>> to pass "is_present_clone".
>>
>> Am I right about this? Or am I missing anything?
>>
>> Please help us, thank you:-)