Re: Upgraded to Quincy 17.2.7: some S3 buckets inaccessible

On Wed, Apr 3, 2024 at 11:58 AM Lorenz Bausch <info@xxxxxxxxxxxxxxx> wrote:
>
> Hi everybody,
>
> We upgraded our containerized Red Hat Pacific cluster to the latest
> Quincy release (Community Edition).

I'm afraid this is not an upgrade path that we try to test or support.
Red Hat makes its own decisions about what to backport into its
releases. My understanding is that Red Hat's Pacific-based 5.3 release
includes all of the rgw multisite resharding changes, which were not
introduced upstream until the Reef release. This includes changes to
data formats that an upstream Quincy release would not understand. In
that case, you might have more luck upgrading to Reef?
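
One way to check for that mismatch (just a sketch; I'm guessing the
metadata keys from the tenant/bucket placeholders in your commands) is
to dump the entrypoint and instance metadata for one of the affected
buckets and compare it against an intact one:

---- %< ----
$ radosgw-admin metadata get bucket:xy/xy
$ radosgw-admin metadata get bucket.instance:xy/xy:6955f50e-5b23-4534-9b77-c7078f60f0d0.171713434.3
---- >% ----

If the broken buckets' instance metadata carries an index layout or
reshard generation that the intact ones don't, that would support the
format-mismatch theory.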

> The upgrade itself went fine, the cluster is HEALTH_OK, all daemons run
> the upgraded version:
>
> ---- %< ----
> $ ceph -s
>    cluster:
>      id:     68675a58-cf09-4ebd-949c-b9fcc4f2264e
>      health: HEALTH_OK
>
>    services:
>      mon: 5 daemons, quorum node02,node03,node04,node05,node01 (age 25h)
>      mgr: node03.ztlair(active, since 25h), standbys: node01.koymku,
> node04.uvxgvp, node02.znqnhg, node05.iifmpc
>      osd: 408 osds: 408 up (since 22h), 408 in (since 7d)
>      rgw: 19 daemons active (19 hosts, 1 zones)
>
>    data:
>      pools:   11 pools, 8481 pgs
>      objects: 236.99M objects, 544 TiB
>      usage:   1.6 PiB used, 838 TiB / 2.4 PiB avail
>      pgs:     8385 active+clean
>               79   active+clean+scrubbing+deep
>               17   active+clean+scrubbing
>
>    io:
>      client:   42 MiB/s rd, 439 MiB/s wr, 2.15k op/s rd, 1.64k op/s wr
>
> ---
>
> $ ceph versions | jq .overall
> {
>    "ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy
> (stable)": 437
> }
> ---- >% ----
>
> After all the daemons were upgraded, we started noticing some RGW
> buckets that are inaccessible.
> s3cmd fails with NoSuchKey:
>
> ---- %< ----
> $ s3cmd la -l
> ERROR: S3 error: 404 (NoSuchKey)
> ---- >% ----
>
> The buckets still exist according to "radosgw-admin bucket list".
> Out of the ~600 buckets, 13 are inaccessible at the moment:
>
> ---- %< ----
> $ radosgw-admin bucket radoslist --tenant xy --uid xy --bucket xy
> 2024-04-03T12:13:40.607+0200 7f0dbf4c4680  0 int
> RGWRados::cls_bucket_list_ordered(const DoutPrefixProvider*,
> RGWBucketInfo&, int, const rgw_obj_index_key&, const string&, const
> string&, uint32_t, bool, uint16_t, RGWRados::ent_map_t&, bool*, bool*,
> rgw_obj_index_key*, optional_yield, RGWBucketListNameFilter):
> CLSRGWIssueBucketList for
> xy:xy[6955f50e-5b23-4534-9b77-c7078f60f0d0.171713434.3]) failed
> 2024-04-03T12:13:40.609+0200 7f0dbf4c4680  0 int
> RGWRados::cls_bucket_list_ordered(const DoutPrefixProvider*,
> RGWBucketInfo&, int, const rgw_obj_index_key&, const string&, const
> string&, uint32_t, bool, uint16_t, RGWRados::ent_map_t&, bool*, bool*,
> rgw_obj_index_key*, optional_yield, RGWBucketListNameFilter):
> CLSRGWIssueBucketList for
> xy:xy[6955f50e-5b23-4534-9b77-c7078f60f0d0.171713434.3]) failed
> ---- >% ----
>
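That CLSRGWIssueBucketList failure means the listing operation against
the bucket's index shard objects is failing; it doesn't by itself mean
the data is gone. As a rough check (the index pool name below is just a
guess based on your data pool name, so substitute whatever your zone
actually uses), you could look for index objects belonging to that
bucket instance:

---- %< ----
$ rados -p rgw.buckets.index ls | grep 6955f50e-5b23-4534-9b77-c7078f60f0d0.171713434.3
---- >% ----

The index shards are normally named .dir.<bucket_id> plus a shard
suffix; if they only show up under a different id than the one in the
error, that again points toward a resharding-format mismatch.
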
> The affected buckets are comparatively large, around 4 - 7 TB,
> but not all buckets of that size are affected.
>
> Judging by "rados -p rgw.buckets.data ls", all the objects seem to
> still be there, although "rados -p rgw.buckets.data get objectname -"
> only prints unusable (?) binary data, even for objects of intact buckets.
>
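The binary output from "rados get" isn't necessarily a sign of
corruption, by the way: the rados-level names include head, multipart
and shadow pieces, and large objects are split into chunks, so the raw
payload usually won't look like the original file. The more useful
signal is whether objects for the affected buckets are still present;
their names in the data pool are prefixed with the bucket's marker,
which you can pull from bucket stats (a sketch, substitute the real
marker value):

---- %< ----
$ radosgw-admin bucket stats --tenant xy --uid xy --bucket xy | grep marker
$ rados -p rgw.buckets.data ls | grep <marker> | head
---- >% ----

If those objects are there, the data itself hasn't been deleted and the
problem is confined to the bucket index / metadata path.
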
> Overall we're facing around 60 TB of customer data which is just gone
> at the moment.
> Is there a way to recover from this situation, or to narrow down the
> root cause of the problem further?
>
> Kind regards,
> Lorenz
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



