Hi Liang,

My guess would be this bug:

https://tracker.ceph.com/issues/44660
https://www.spinics.net/lists/ceph-users/msg30151.html

It has actually existed for at least 6 years:

https://tracker.ceph.com/issues/16767

It occurs any time you re-upload the same *part* of a single Multipart Upload multiple times. For example, if my multipart upload consists of 3 parts and I upload part #2 twice, the first upload of part #2 becomes orphaned.

If this is indeed the cause, you should see multiple "_multipart_" rados objects for the same part in "rados ls". For example, here are all the rados objects associated with a bugged bucket before I deleted it:

cc79b188-89d1-4f47-acb1-ab90513e9bc9.23325574.228__multipart_file.txt.4vkWzU4C5XLd2R6unFgbQ6aZM26vPuq8.1
cc79b188-89d1-4f47-acb1-ab90513e9bc9.23325574.228__multipart_file.txt.2~4zogSe4Ep0xvSC8j6aX71x_96cOgvQN.1
cc79b188-89d1-4f47-acb1-ab90513e9bc9.23325574.228__shadow_file.txt.4vkWzU4C5XLd2R6unFgbQ6aZM26vPuq8.1_1
cc79b188-89d1-4f47-acb1-ab90513e9bc9.23325574.228__shadow_file.txt.2~4zogSe4Ep0xvSC8j6aX71x_96cOgvQN.1_1

If we look at just these two:

cc79b188-89d1-4f47-acb1-ab90513e9bc9.23325574.228__multipart_file.txt.4vkWzU4C5XLd2R6unFgbQ6aZM26vPuq8.1
cc79b188-89d1-4f47-acb1-ab90513e9bc9.23325574.228__multipart_file.txt.2~4zogSe4Ep0xvSC8j6aX71x_96cOgvQN.1

they are in the format:

$BUCKETID__multipart_$S3KEY.$PARTUID.$PARTNUM

Because everything matches ($BUCKETID, $S3KEY, $PARTNUM) except for $PARTUID, this S3 object has been affected by the bug. If you find instances of rados keys that match on everything except $PARTUID, then this bug is probably the cause (a rough way to script that check is sketched at the end of this message).

Josh

________________________________
From: 郑亮 <zhengliang0901@xxxxxxxxx>
Sent: Wednesday, October 12, 2022 1:34:31 AM
To: ceph-users@xxxxxxx
Subject: why rgw generates large quantities orphan objects?

Hi all,

Description of problem: [RGW] Bucket/object deletion is leaving behind large quantities of orphaned rados objects.

The cluster was running a cosbench workload. We first removed part of the data by deleting objects from the cosbench client, then deleted all the buckets with `s3cmd rb --recursive --force`. That command removed every bucket, but it did not help with space reclamation.

```
[root@node01 /]# rgw-orphan-list
Available pools:
device_health_metrics
.rgw.root
os-test.rgw.buckets.non-ec
os-test.rgw.log
os-test.rgw.control
os-test.rgw.buckets.index
os-test.rgw.meta
os-test.rgw.buckets.data
deeproute-replica-hdd-pool
deeproute-replica-ssd-pool
cephfs-metadata
cephfs-replicated-pool
.nfs
Which pool do you want to search for orphans (for multiple, use space-separated list)? os-test.rgw.buckets.data
Pool is "os-test.rgw.buckets.data".
Note: output files produced will be tagged with the current timestamp -- 20221008062356.
running 'rados ls' at Sat Oct 8 06:24:05 UTC 2022
running 'rados ls' on pool os-test.rgw.buckets.data.
running 'radosgw-admin bucket radoslist' at Sat Oct 8 06:43:21 UTC 2022
computing delta at Sat Oct 8 06:47:17 UTC 2022
39662551 potential orphans found out of a possible 39844453 (99%).
The results can be found in './orphan-list-20221008062356.out'.
Intermediate files are './rados-20221008062356.intermediate' and './radosgw-admin-20221008062356.intermediate'.
***
*** WARNING: This is EXPERIMENTAL code and the results should be used
*** only with CAUTION!
***
Done at Sat Oct 8 06:48:07 UTC 2022.
```
```
[root@node01 /]# radosgw-admin gc list
[]

[root@node01 /]# cat orphan-list-20221008062356.out | wc -l
39662551

[root@node01 /]# rados df
POOL_NAME                    USED      OBJECTS   CLONES  COPIES     MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS     RD       WR_OPS     WR       USED COMPR  UNDER COMPR
.nfs                         4.3 MiB   4         0       12         0                   0        0         77398      76 MiB   146        79 KiB   0 B         0 B
.rgw.root                    180 KiB   16        0       48         0                   0        0         28749      28 MiB   0          0 B      0 B         0 B
cephfs-metadata              932 MiB   14772     0       44316      0                   0        0         1569690    3.8 GiB  1258651    3.4 GiB  0 B         0 B
cephfs-replicated-pool       738 GiB   300962    0       902886     0                   0        0         794612     470 GiB  770689     245 GiB  0 B         0 B
deeproute-replica-hdd-pool   1016 GiB  104276    0       312828     0                   0        0         18176216   298 GiB  441783780  6.7 TiB  0 B         0 B
deeproute-replica-ssd-pool   30 GiB    3691      0       11073      0                   0        0         2466079    2.1 GiB  8416232    221 GiB  0 B         0 B
device_health_metrics        50 MiB    108       0       324        0                   0        0         1836       1.8 MiB  1944       18 MiB   0 B         0 B
os-test.rgw.buckets.data     5.6 TiB   39844453  0       239066718  0                   0        0         552896177  3.0 TiB  999441015  60 TiB   0 B         0 B
os-test.rgw.buckets.index    1.8 GiB   33        0       99         0                   0        0         153600295  154 GiB  110916573  62 GiB   0 B         0 B
os-test.rgw.buckets.non-ec   2.1 MiB   45        0       135        0                   0        0         574240     349 MiB  153725     139 MiB  0 B         0 B
os-test.rgw.control          0 B       8         0       24         0                   0        0         0          0 B      0          0 B      0 B         0 B
os-test.rgw.log              3.7 MiB   346       0       1038       0                   0        0         83877803   80 GiB   6306730    7.6 GiB  0 B         0 B
os-test.rgw.meta             220 KiB   23        0       69         0                   0        0         640854     506 MiB  108229     53 MiB   0 B         0 B

total_objects    40268737
total_used       7.8 TiB
total_avail      1.1 PiB
total_space      1.1 PiB
```

ceph version:

```
[root@node01 /]# ceph versions
{
    "mon": {
        "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)": 108
    },
    "mds": {
        "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)": 9
    },
    "overall": {
        "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)": 124
    }
}
```

Thanks,
Best regards
Liang Zheng

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
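
To illustrate the $PARTUID comparison Josh describes above, here is a minimal sketch (not part of any Ceph tooling) that groups "__multipart_" rados keys by bucket ID, S3 key, and part number and reports any group containing more than one part UID. It assumes you have dumped the data pool's object names to a plain-text file, one name per line, e.g. with `rados -p os-test.rgw.buckets.data ls > rados-objects.txt`; the file name `rados-objects.txt` is just a placeholder.

```
#!/usr/bin/env python3
# Minimal sketch: find multipart "head" objects that share bucket ID, S3 key,
# and part number but differ in part UID -- the pattern described above.
# Input: a text file with one rados object name per line ("rados ls" output).

import sys
from collections import defaultdict

MARKER = "__multipart_"

def parse(name):
    """Split $BUCKETID__multipart_$S3KEY.$PARTUID.$PARTNUM into its pieces."""
    if MARKER not in name:
        return None  # shadow objects and ordinary heads are not of interest here
    bucket_id, rest = name.split(MARKER, 1)
    # The part UID and part number come last and contain no dots,
    # so split from the right; the S3 key itself may contain dots.
    pieces = rest.rsplit(".", 2)
    if len(pieces) != 3 or not pieces[2].isdigit():
        return None  # unexpected layout (e.g. an unusual key name); skip it
    s3_key, part_uid, part_num = pieces
    return bucket_id, s3_key, part_num, part_uid

def main(path):
    groups = defaultdict(set)
    with open(path) as fh:
        for line in fh:
            parsed = parse(line.strip())
            if parsed:
                bucket_id, s3_key, part_num, part_uid = parsed
                groups[(bucket_id, s3_key, part_num)].add(part_uid)

    for (bucket_id, s3_key, part_num), uids in sorted(groups.items()):
        if len(uids) > 1:
            print(f"{bucket_id} key={s3_key} part={part_num}: {len(uids)} part UIDs")
            for uid in sorted(uids):
                print(f"    {uid}")

if __name__ == "__main__":
    # "rados-objects.txt" is only a placeholder default; pass your own dump.
    main(sys.argv[1] if len(sys.argv) > 1 else "rados-objects.txt")
```

Running it against the './rados-20221008062356.intermediate' file that rgw-orphan-list left behind should also work, since that file appears to be the raw 'rados ls' listing of the data pool.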