Re: Ceph Octopus RGW 15.2.17 - files not available in rados while still in bucket index

I just checked something else and it looks like this problem happens when
our SSD OSDs get marked as laggy, because of the GC bug:
https://tracker.ceph.com/issues/53585

:2022-08-18T22:00:12.257+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.263 marked itself dead as of e658014
:2022-08-18T22:01:48.727+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.242 marked itself dead as of e658018
:2022-08-18T22:03:07.898+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.263 marked itself dead as of e658023
:2022-08-18T22:10:54.963+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.242 marked itself dead as of e658028
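
To see how often each OSD flaps, the cluster log lines above can be tallied with a short script. This is only a minimal sketch: the regex and the helper name `count_dead_events` are illustrative assumptions, not part of Ceph, and the input is assumed to be plain text copied from the cluster log (mail line wrapping is undone first).

```python
import re
from collections import Counter

# Matches e.g. "osd.263 marked itself dead as of e658014" (pattern is an assumption
# based on the four log lines quoted in this mail).
DEAD_RE = re.compile(r"(osd\.\d+) marked itself dead as of e\d+")

def count_dead_events(log_text):
    """Return a Counter mapping OSD name -> number of 'marked itself dead' events."""
    # Collapse all whitespace so events wrapped across lines match as one string.
    joined = " ".join(log_text.split())
    return Counter(m.group(1) for m in DEAD_RE.finditer(joined))

sample = """
:2022-08-18T22:00:12.257+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.263 marked itself dead as of e658014
:2022-08-18T22:01:48.727+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.242 marked itself dead as of e658018
:2022-08-18T22:03:07.898+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.263 marked itself dead as of e658023
:2022-08-18T22:10:54.963+0000 7fb9dbe62700  0 log_channel(cluster) log
[INF] : osd.242 marked itself dead as of e658028
"""

for osd, count in sorted(count_dead_events(sample).items()):
    print(osd, count)
```

Run against a longer slice of the cluster log, this would show whether the same few SSD OSDs are the ones flapping around the time of the failed uploads.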

Our S3 cluster is also used by our backup center, which receives RBD exports
from our RBD clusters (usually multiple GB/TB in size).
We added some SSD OSDs and put all of our non-data pools on these SSD OSDs.

This helped to relieve some pressure on the cluster when the GC goes
nuts. Maybe the two happen together.

On Sun, 21 Aug 2022 at 19:34, Boris Behrens <bb@xxxxxxxxx> wrote:

> Cheers everybody,
>
> I had this issue some time ago, and we thought it was fixed, but it seems
> to be happening again.
> We have files, uploaded by one of our customers, that are only available in
> the bucket index, but not in RADOS.
>
> At first we thought this might be a bug (
> https://tracker.ceph.com/issues/54528) which got fixed with the last
> point release, but it seems not. And only one customer has this problem. At
> the moment we think it is some very weird usage of the S3 API (they
> developed their own library and used the AWS SDK for .NET as a basis)
> together with multipart uploads.
>
> I am also not sure HOW they do the upload, because it is a backup that gets
> uploaded every day and they seem to have multiple of them. I didn't go
> through all of our logs, but I managed to pull one lifecycle of a file from
> the logs. It showed very strange errors at the end, and I couldn't find
> anything about these errors.
>
> Hope someone can tell me what this is and how I can fix it.
>
> Cheers
>  Boris
>
> Strange errors:
> 2022-08-18T22:04:29.538+0000 7f7ba9fcb700  0 req 9033182355071581504
> 183.407425780s s3:complete_multipart WARNING: failed to remove object
> sql-backup-de:_multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
> 2022-08-18T22:04:29.542+0000 7f7ba9fcb700  0 req 9033182355071581504
> 183.411425768s s3:complete_multipart WARNING: failed to unlock
> CLUSTERUUID.BUCKET.INDENTIFIER__multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
>
> Full log (trimmed when only partNumber changed):
> 2022-08-18T22:01:08.894838+0000 "GET
> /sql-backup-sde/IM_DIFFERENTIAL_22.bak HTTP/1.1" 200 315392 -
> "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
> 2022-08-18T22:01:08.930838+0000 "POST
> /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploads HTTP/1.1" 200 271 -
> "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
> 2022-08-18T22:01:09.108374+0000 "POST
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploads HTTP/1.1" 200 270 -
> "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -
> 2022-08-18T22:01:09.472368+0000 "PUT
> /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploadId=2~KX75VPCYFOZRPRLo5L0ytQuyp-nzbrT&partNumber=4
> HTTP/1.1" 200 2523136 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> ..
> 2022-08-18T22:01:09.619099+0000 "PUT
> /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploadId=2~KX75VPCYFOZRPRLo5L0ytQuyp-nzbrT&partNumber=2
> HTTP/1.1" 200 8388608 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:01:09.706836+0000 "POST
> /sql-backup-sde/IM_DIFFERENTIAL_22.bak?uploadId=2~KX75VPCYFOZRPRLo5L0ytQuyp-nzbrT
> HTTP/1.1" 200 334 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:01:09.852362+0000 "PUT
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs&partNumber=1
> HTTP/1.1" 200 8388608 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> ..
> 2022-08-18T22:01:26.098900+0000 "PUT
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs&partNumber=161
> HTTP/1.1" 200 8388608 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:02:14.103386+0000 "GET /sql-backup-de/IM_DIFFERENTIAL_22.bak
> HTTP/1.1" 200 4194304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:02:26.275201+0000 "POST
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs
> HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:02:27.787178+0000 "POST
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs
> HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:02:29.386586+0000 "POST
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs
> HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:02:30.911130+0000 "POST
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs
> HTTP/1.1" 500 304 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:02:30.999129+0000 "DELETE
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs
> HTTP/1.1" 204 0 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:02:42.782544+0000 "GET /sql-backup-de/IM_DIFFERENTIAL_22.bak
> HTTP/1.1" 200 0 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:04:29.538+0000 7f7ba9fcb700  0 req 9033182355071581504
> 183.407425780s s3:complete_multipart WARNING: failed to remove object
> sql-backup-de:_multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
> 2022-08-18T22:04:29.542210+0000 "POST
> /sql-backup-de/IM_DIFFERENTIAL_22.bak?uploadId=2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs
> HTTP/1.1" 200 334 - "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar
> Botocore/1.27.23" -
> 2022-08-18T22:04:29.542+0000 7f7ba9fcb700  0 req 9033182355071581504
> 183.411425768s s3:complete_multipart WARNING: failed to unlock
> CLUSTERUUID.BUCKET.INDENTIFIER__multipart_IM_DIFFERENTIAL_22.bak.2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs.meta
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>
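
The request sequence in the quoted access log can be grouped per upload ID to make the lifecycle easier to read. The following is a rough sketch, not an RGW tool: it assumes each access-log entry has been rejoined onto one line with the "> " quote prefix removed, and the names `multipart_timeline` and `classify` are hypothetical helpers introduced for illustration.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit, parse_qs

# timestamp "METHOD target HTTP/x.y" status bytes  (layout assumed from the log above)
LINE_RE = re.compile(r'(\S+) "(\w+) (\S+) HTTP/[\d.]+" (\d{3}) (\d+)')

def classify(method, query):
    """Map an S3 request carrying an uploadId to a multipart operation name."""
    if method == "PUT" and "partNumber" in query:
        return "upload_part"
    if method == "POST":
        return "complete"
    if method == "DELETE":
        return "abort"
    return "other"

def multipart_timeline(lines):
    """Group (timestamp, operation, status) tuples by uploadId, in log order."""
    timeline = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # e.g. RGW debug lines such as the WARNING entries
        ts, method, target, status, _size = m.groups()
        query = parse_qs(urlsplit(target).query, keep_blank_values=True)
        upload_id = query.get("uploadId", [None])[0]
        if upload_id is None:
            continue  # plain GETs and the initiating POST ?uploads carry no uploadId
        timeline[upload_id].append((ts, classify(method, query), int(status)))
    return dict(timeline)

UA = '- "Boto3/1.24.23 Python/3.10.5 Linux/5.10.102-flatcar Botocore/1.27.23" -'
UID = "2~ehGVVRPV3LnWW31bRmBEcOHSKB_zJAs"
OBJ = "/sql-backup-de/IM_DIFFERENTIAL_22.bak"
sample = [
    f'2022-08-18T22:01:26.098900+0000 "PUT {OBJ}?uploadId={UID}&partNumber=161 HTTP/1.1" 200 8388608 {UA}',
    f'2022-08-18T22:02:26.275201+0000 "POST {OBJ}?uploadId={UID} HTTP/1.1" 500 304 {UA}',
    f'2022-08-18T22:02:30.999129+0000 "DELETE {OBJ}?uploadId={UID} HTTP/1.1" 204 0 {UA}',
    f'2022-08-18T22:04:29.542210+0000 "POST {OBJ}?uploadId={UID} HTTP/1.1" 200 334 {UA}',
]

for ts, op, status in multipart_timeline(sample)[UID]:
    print(ts, op, status)
```

For the upload quoted above, such a timeline shows a complete request returning 200 at 22:04:29 after the abort (DELETE) at 22:02:30 had already returned 204. The debug log reports about 183 s of latency for that request, which would put its start near 22:01:26, before the 500 responses and the DELETE. Whether that overlap explains the missing RADOS objects is only a guess and is not proven by these logs.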


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.