Re: radosgw: stale/leaked bucket index entries


 



Good to know about the update. It was worth sending the email to the ML.

Thanks,
Nitin


On 8/8/17, 8:07 PM, "Pavan Rallabhandi" <PRallabhandi@xxxxxxxxxxxxxxx> wrote:

    Yes, I have given the tracker details in the users ML; here it is: http://tracker.ceph.com/issues/20380
    
    Thanks,
    -Pavan.
    
    On 09/08/17, 4:49 AM, "ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of Matt Benjamin" <ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of mbenjami@xxxxxxxxxx> wrote:
    
        Hi Nitin,
        
        It's pending-backport.
        
        Matt
        
        On Tue, Aug 8, 2017 at 6:37 PM, Kamble, Nitin A
        <Nitin.Kamble@xxxxxxxxxxxx> wrote:
        > Good that you have brought up this issue. I have also seen stale index entries lingering after object deletion in RGW. It would be nice to see this issue captured and addressed in a bug.
        >
        > Nitin
        >
        >
        > On 6/19/17, 10:37 AM, "ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of Pavan Rallabhandi" <ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of PRallabhandi@xxxxxxxxxxxxxxx> wrote:
        >
        >     On many of our clusters running Jewel (10.2.5+), I am running into a strange problem of stale bucket index entries left over for (some of the) deleted objects. Though it is not reproducible at will, it has been pretty consistent of late, and I am clueless at this point about the possible reasons for it to happen.
        >
        >     The symptoms are that the actual delete operation of an object is reported successful in the RGW logs, but a bucket list on the container would still show the deleted object. An attempt to download/stat the object appropriately results in a failure. No failures are seen in the respective OSDs where the bucket index object is located. And rebuilding the bucket index by running ‘radosgw-admin bucket check --fix’ would fix the issue.
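        >
        >     For illustration, the sequence we observe looks roughly like the below (the container/object names are placeholders):
        >
        >         swift delete mycontainer myobj        # delete is reported successful
        >         swift list mycontainer                # myobj still shows up in the listing
        >         swift stat mycontainer myobj          # fails, as expected for a deleted object
        >
        >         # rebuilding the bucket index clears the stale entry
        >         radosgw-admin bucket check --fix --bucket=mycontainer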
        >
        >     Though I could simulate the problem by instrumenting the code, to not to have invoked `complete_del` on the bucket index op https://github.com/ceph/ceph/blob/master/src/rgw/rgw_rados.cc#L8793, but that call is always seem to be made unless there is a cascading error from the actual delete operation of the object, which doesn’t seem to be the case here.
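        >
        >     For context, here is a minimal sketch of the two-phase index update around a delete, with names simplified and hypothetical -- this is not the actual RGW code, which lives in src/rgw/rgw_rados.cc:
        >
        >         // Simplified, hypothetical flow of an RGW object delete.
        >         int delete_obj(IndexOp& index_op, RadosObj& obj) {
        >           int r = index_op.prepare(CLS_RGW_OP_DEL);  // mark the index entry pending
        >           if (r < 0)
        >             return r;
        >           r = rados_delete(obj);                     // remove the RADOS object(s)
        >           if (r < 0) {
        >             index_op.cancel();                       // roll back the pending entry
        >             return r;
        >           }
        >           // If this completion never lands on the bucket index object, the
        >           // entry lingers until a dir_suggest or 'bucket check --fix'
        >           // reconciles it.
        >           return index_op.complete_del();
        >         }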
        >
        >     I wanted to know the possible reasons why the bucket index would be left in such limbo; any pointers would be much appreciated. FWIW, we are not sharding the buckets, very recently I’ve seen this happen with buckets having fewer than 10 objects, and we are using Swift for all the operations.
        >
        >     Thanks,
        >     -Pavan.
        >
        >
        
    
    




