Re: NoSuchKey on key that is visible in s3 list/radosgw bk

I found some of the data in the rados ls dump. We host some WARCs from the Internet Archive and one affected WARC still has its warc.os.cdx.gz file intact, while the actual warc.gz is gone.

A rados stat revealed

WIDE-20110903143858-01166.warc.os.cdx.gz mtime 2019-07-14T17:48:39.000000+0200, size 1060428

for the cdx.gz file, but

WIDE-20110903143858-01166.warc.gz mtime 2019-07-14T17:04:49.000000+0200, size 0

for the warc.gz.

I couldn't find any of the suffixed multipart objects listed in radosgw-admin stat.

WIDE-20110903143858-01166.warc.gz.2~m5Y42lPMIeis5qgJAZJfuNnzOKd7lme.19: (2) No such file or directory


On 10/11/2020 10:14, Janek Bevendorff wrote:
Thanks for the reply. This issue seems to be VERY serious. New objects are disappearing every day. This is a silent, creeping data loss.

I couldn't find the object with rados stat, but I am now listing all the objects and will grep the dump to see if there is anything left.

Janek

On 09/11/2020 23:31, Rafael Lopez wrote:
Hi Mariusz, all

We have seen this issue as well, on Red Hat Ceph Storage 4 (I have an unresolved case open). In our case, `radosgw-admin object stat` is not a sufficient check that the rados objects actually exist. You have to do a `rados stat` to know that.

In your case, the object is ~48M in size and also appears to use S3 multipart. This means that, when uploaded, it is sliced into parts based on the S3 multipart size in use (5M default, I think 8M here). After that, each incoming part is further sliced into rados objects of 4 MiB (default).
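The slicing described above can be checked with a little arithmetic. The following is a minimal Python sketch, not rgw's actual code: `multipart_layout` is a hypothetical helper name, and it assumes every part is cut into stripes of at most 4 MiB (4194304 bytes). The object size 50584342 and the part size 8388608 are the values reported for this object elsewhere in the thread.

```python
def multipart_layout(object_size, part_size, stripe_size=4 * 1024 * 1024):
    """Return a list of (part_number, part_bytes, stripe_count) tuples.

    Assumption: each part is split into stripes of at most stripe_size
    bytes, and only the last part may be shorter than part_size.
    """
    if object_size <= 0 or part_size <= 0 or stripe_size <= 0:
        raise ValueError("sizes must be positive")
    layout = []
    offset = 0
    part_number = 1
    while offset < object_size:
        part_bytes = min(part_size, object_size - offset)
        # Integer ceiling division: number of stripes needed for this part.
        stripes = -(-part_bytes // stripe_size)
        layout.append((part_number, part_bytes, stripes))
        offset += part_bytes
        part_number += 1
    return layout


layout = multipart_layout(50584342, 8388608)
for part_number, part_bytes, stripes in layout:
    print(part_number, part_bytes, stripes)
print(len(layout), "parts,", sum(s for _, _, s in layout), "stripes")
```

For this object that gives 7 parts (six of 8 MiB plus a 252694-byte remainder) and 13 stripes in total, which agrees with the `-7` suffix on the etag.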

The end result is that you have a bunch of rados objects labelled with the 'prefix' from the `radosgw-admin object stat` you ran, as well as a head object (named the same as the S3 object you uploaded) that contains the metadata so rgw knows how to put the S3 object back together. In our case, the head object is there but the other rados pieces that hold the actual data seem to be gone, so `radosgw-admin object stat` returns fine, but we get NoSuchKey when trying to download.
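To get the strings worth searching for, the 'prefix' can be pulled out of the stat JSON programmatically. This is only a sketch under assumptions: `manifest_search_terms` is a hypothetical helper, the sample JSON is a trimmed copy of the output quoted in this thread, and the upload ID is taken to be whatever follows "<object name>." in the prefix, which is how it looks in that output.

```python
import json


def manifest_search_terms(stat_output):
    """Extract head name, tail prefix and upload ID from stat JSON text."""
    doc = json.loads(stat_output)
    name = doc["name"]
    manifest = doc["manifest"]
    prefix = manifest["prefix"]
    upload_id = None
    # Assumption: the prefix is "<object name>.<upload id>", as in the
    # output quoted in this thread.
    if prefix.startswith(name + "."):
        upload_id = prefix[len(name) + 1:]
    part_sizes = [rule["val"]["part_size"] for rule in manifest.get("rules", [])]
    return {
        "head": name,
        "prefix": prefix,
        "upload_id": upload_id,
        "part_sizes": part_sizes,
    }


# Trimmed copy of the stat output quoted in this thread.
SAMPLE = """
{
  "name": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
  "size": 50584342,
  "manifest": {
    "prefix": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4.2~NTy88SkDkXR9ifSrrRcw5WPDxqN3PO2",
    "rules": [
      {"key": 0, "val": {"start_part_num": 1, "part_size": 8388608, "stripe_max_size": 4194304}},
      {"key": 50331648, "val": {"start_part_num": 7, "part_size": 252694, "stripe_max_size": 4194304}}
    ]
  }
}
"""

terms = manifest_search_terms(SAMPLE)
print(terms["upload_id"])
print(terms["part_sizes"])
```

The upload ID and the object name are the two strings to look for in a listing of the data pool.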

Try `rados -p {rgw buckets pool} stat 255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4`, it will show you the rados stat of the head object, which will be much smaller than the S3 object.

To check if you actually have all rados objects for this 48M S3 object, try searching for parts of the prefix or the whole prefix in a list of all rados objects in the buckets pool. FYI, `rados ls` will list every rados object in the pool, so the output may be very large and take a long time to produce if you have many objects.

rados -p {rgw buckets pool} ls > {tmpfile}
grep '2~NTy88SkDkXR9ifSrrRcw5WPDxqN3PO2' {tmpfile}
grep 'juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4' {tmpfile}

The first grep searches for the S3 multipart upload ID string that rgw adds to the prefix.
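If the dump is large, both searches can be done in a single pass instead of running grep twice. A rough Python equivalent is sketched here; `scan_dump` and `scan_dump_file` are hypothetical helper names, and the listing lines in the example are made up for illustration rather than taken from a real pool.

```python
def scan_dump(lines, needles):
    """Return {needle: [matching object names]} using plain substring matches."""
    hits = {needle: [] for needle in needles}
    for line in lines:
        oid = line.rstrip("\n")
        for needle in needles:
            if needle in oid:
                hits[needle].append(oid)
    return hits


def scan_dump_file(path, needles):
    """Stream a saved `rados ls` dump line by line, so it never sits in memory."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return scan_dump(handle, needles)


# Made-up listing lines, only to show the shape of the result.
example_lines = [
    "example-marker_some/other/object\n",
    "example-marker_255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4\n",
]
needles = [
    "2~NTy88SkDkXR9ifSrrRcw5WPDxqN3PO2",
    "juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
]
result = scan_dump(example_lines, needles)
for needle, found in result.items():
    print(needle, len(found))
```

In this made-up listing the object name matches once and the upload ID not at all, which is the pattern described above: a head object is present but none of the pieces that hold the data.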

Rafael

On Tue, 10 Nov 2020 at 01:04, Janek Bevendorff <janek.bevendorff@xxxxxxxxxxxxx> wrote:

    We are having the exact same problem (also Octopus). The object is
    listed by s3cmd, but trying to download it results in a 404 error.
    radosgw-admin object stat shows that the object still exists. Any
    further ideas how I can restore access to this object?

    (Sorry if this is a duplicate, but it seems like the mailing list
    hasn't accepted my original mail).


    > Mariusz Gronczewski wrote:
    >
    >
    >> On 2020-07-27, at 21:31:33,
    >> "Robin H. Johnson" <robbat2@xxxxxxxxxx> wrote:
    >>
    >>
    >>>
    >>> On Mon, Jul 27, 2020 at 08:02:23PM +0200, Mariusz Gronczewski wrote:
    >>>
    >>>> Hi,
    >>>> I've got a problem on Octopus (15.2.3, debian packages) install,
    >>>> bucket S3 index shows a file:
    >>>>     s3cmd ls s3://upvid/255/38355 --recursive
    >>>>     2020-07-27 17:48  50584342  s3://upvid/255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4
    >>>> radosgw-admin bi list also shows it
    >>>>     {
    >>>>         "type": "plain",
    >>>>         "idx": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
    >>>>         "entry": {
    >>>>             "name": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
    >>>>             "instance": "",
    >>>>             "ver": {
    >>>>                 "pool": 11,
    >>>>                 "epoch": 853842
    >>>>             },
    >>>>             "locator": "",
    >>>>             "exists": "true",
    >>>>             "meta": {
    >>>>                 "category": 1,
    >>>>                 "size": 50584342,
    >>>>                 "mtime": "2020-07-27T17:48:27.203008Z",
    >>>>                 "etag": "2b31cc8ce8b1fb92a5f65034f2d12581-7",
    >>>>                 "storage_class": "",
    >>>>                 "owner": "filmweb-app",
    >>>>                 "owner_display_name": "filmweb app user",
    >>>>                 "content_type": "",
    >>>>                 "accounted_size": 50584342,
    >>>>                 "user_data": "",
    >>>>                 "appendable": "false"
    >>>>             },
    >>>>             "tag": "_3ubjaztglHXfZr05wZCFCPzebQf-ZFP",
    >>>>             "flags": 0,
    >>>>             "pending_map": [],
    >>>>             "versioned_epoch": 0
    >>>>         }
    >>>>     },
    >>>>
    >>>> but trying to download it via curl (I've set permissions to public)
    >>>> only gets me
    >>>>
    >>> Does the RADOS object for this still exist?
    >>>
    >>> try:
    >>> radosgw-admin object stat --bucket ... --object
    >>> '255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4'
    >>>
    >>> If that doesn't return, then the backing object is gone, and you have
    >>> a stale index entry that can be cleaned up in most cases with check
    >>> bucket.
    >>>
    >>> For cases where that doesn't fix it, my recommended way to fix it is
    >>> to write a new 0-byte object to the same name, then delete it.
    >>>
    >>
    >>
    >>
    >> it does exist:
    >>
    >> {
    >>   "name": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
    >>   "size": 50584342,
    >>   "policy": {
    >>       "acl": {...},
    >>       "owner": {...}
    >>   },
    >>   "etag": "2b31cc8ce8b1fb92a5f65034f2d12581-7",
    >>   "tag": "_3ubjaztglHXfZr05wZCFCPzebQf-ZFP",
    >>   "manifest": {
    >>       "objs": [],
    >>       "obj_size": 50584342,
    >>       "explicit_objs": "false",
    >>       "head_size": 0,
    >>       "max_head_size": 0,
    >>       "prefix": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4.2~NTy88SkDkXR9ifSrrRcw5WPDxqN3PO2",
    >>       "rules": [
    >>           {
    >>               "key": 0,
    >>               "val": {
    >>                   "start_part_num": 1,
    >>                   "start_ofs": 0,
    >>                   "part_size": 8388608,
    >>                   "stripe_max_size": 4194304,
    >>                   "override_prefix": ""
    >>               }
    >>           },
    >>           {
    >>               "key": 50331648,
    >>               "val": {
    >>                   "start_part_num": 7,
    >>                   "start_ofs": 50331648,
    >>                   "part_size": 252694,
    >>                   "stripe_max_size": 4194304,
    >>                   "override_prefix": ""
    >>               }
    >>           }
    >>       ],
    >>       "tail_instance": "",
    >>       "tail_placement": {
    >>           "bucket": {
    >>               "name": "upvid",
    >>               "marker": "88d4f221-0da5-444d-81a8-517771278350.665933.2",
    >>               "bucket_id": "88d4f221-0da5-444d-81a8-517771278350.665933.2",
    >>               "tenant": "",
    >>               "explicit_placement": {
    >>                   "data_pool": "",
    >>                   "data_extra_pool": "",
    >>                   "index_pool": ""
    >>               }
    >>           },
    >>           "placement_rule": "default-placement"
    >>       },
    >>       "begin_iter": {
    >>           "part_ofs": 0,
    >>           "stripe_ofs": 0,
    >>           "ofs": 0,
    >>           "stripe_size": 4194304,
    >>           "cur_part_id": 1,
    >>           "cur_stripe": 0,
    >>           "cur_override_prefix": "",
    >>           "location": {
    >>               "placement_rule": "default-placement",
    >>               "obj": {
    >>                   "bucket": {
    >>                       "name": "upvid",
    >>                       "marker": "88d4f221-0da5-444d-81a8-517771278350.665933.2",
    >>                       "bucket_id": "88d4f221-0da5-444d-81a8-517771278350.665933.2",
    >>                       "tenant": "",
    >>                       "explicit_placement": {
    >>                           "data_pool": "",
    >>                           "data_extra_pool": "",
    >>                           "index_pool": ""
    >>                       }
    >>                   },
    >>                   "key": {
    >>                       "name": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4.2~NTy88SkDkXR9ifSrrRcw5WPDxqN3PO2.1",
    >>                       "instance": "",
    >>                       "ns": "multipart"
    >>                   }
    >>               },
    >>               "raw_obj": {
    >>                   "pool": "",
    >>                   "oid": "",
    >>                   "loc": ""
    >>               },
    >>               "is_raw": false
    >>           }
    >>       },
    >>       "end_iter": {
    >>           "part_ofs": 50584342,
    >>           "stripe_ofs": 50584342,
    >>           "ofs": 50584342,
    >>           "stripe_size": 252694,
    >>           "cur_part_id": 8,
    >>           "cur_stripe": 0,
    >>           "cur_override_prefix": "",
    >>           "location": {
    >>               "placement_rule": "default-placement",
    >>               "obj": {
    >>                   "bucket": {
    >>                       "name": "upvid",
    >>                       "marker": "88d4f221-0da5-444d-81a8-517771278350.665933.2",
    >>                       "bucket_id": "88d4f221-0da5-444d-81a8-517771278350.665933.2",
    >>                       "tenant": "",
    >>                       "explicit_placement": {
    >>                           "data_pool": "",
    >>                           "data_extra_pool": "",
    >>                           "index_pool": ""
    >>                       }
    >>                   },
    >>                   "key": {
    >>                       "name": "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4.2~NTy88SkDkXR9ifSrrRcw5WPDxqN3PO2.8",
    >>                       "instance": "",
    >>                       "ns": "multipart"
    >>                   }
    >>               },
    >>               "raw_obj": {
    >>                   "pool": "",
    >>                   "oid": "",
    >>                   "loc": ""
    >>               },
    >>               "is_raw": false
    >>           }
    >>       }
    >>   },
    >>   "attrs": {
    >>       "user.rgw.pg_ver": "/u0005u",
    >>       "user.rgw.source_zone": "�I}�",
    >>       "user.rgw.tail_tag": "88d4f221-0da5-444d-81a8-517771278350.658638.5538809",
    >>       "user.rgw.x-amz-acl": "private",
    >>       "user.rgw.x-amz-content-sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    >>       "user.rgw.x-amz-date": "20200619T194517Z"
    >>   }
    >> }
    >>
    >>
    >>
    >>
    >>
    >> --
    >> Mariusz Gronczewski, Administrator
    >>
    >> Efigence S. A.
    >> ul. Wołoska 9a, 02-583 Warszawa
    >> T:   [+48] 22 380 13 13
    >> NOC: [+48] 22 380 10 20
    >> E: admin@xxxxxxxxxxxx
    >> _______________________________________________
    >> ceph-users mailing list -- ceph-users@xxxxxxx
    >> To unsubscribe send an email to ceph-users-leave@xxxxxxx



--
*Rafael Lopez*
Devops Systems Engineer
Monash University eResearch Centre

T: +61 3 9905 9118
E: rafael.lopez@xxxxxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



