Re: Fixing a rgw bucket index

We ended up directly importing our original files to another bucket. Now we're cleaning the files in the broken bucket.

Thanks for all the help.


On Mon, Apr 8, 2013 at 10:27 PM, Erdem Agaoglu <erdem.agaoglu@xxxxxxxxx> wrote:
There seems to be an open issue for this in s3cmd: https://github.com/s3tools/s3cmd/issues/37. I'll try other tools.


On Mon, Apr 8, 2013 at 9:26 PM, Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:
This one fails because copying an object onto itself only works when
replacing its attrs (X_AMZ_METADATA_DIRECTIVE=REPLACE).
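A hedged sketch of that fix: the copy request must carry the REPLACE metadata directive instead of COPY. The snippet below uses boto3-style parameter names, which are my assumption (the thread itself uses s3cmd); the bucket/key are the ones from the failing example.

```python
# Build the arguments for an in-place copy. With metadata directive
# "COPY" (s3cmd's default for mv) rgw rejects a self-copy as
# InvalidRequest; "REPLACE" makes it legal because the attrs are
# rewritten rather than copied from the (same) source.
def self_copy_args(bucket, key):
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "MetadataDirective": "REPLACE",
    }

# With a configured boto3 client this would be issued as (not run here):
#   import boto3
#   boto3.client("s3").copy_object(**self_copy_args("imgiz", "data/avatars/492/492923.jpg"))
args = self_copy_args("imgiz", "data/avatars/492/492923.jpg")
```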

On Mon, Apr 8, 2013 at 10:35 AM, Erdem Agaoglu <erdem.agaoglu@xxxxxxxxx> wrote:
> This is the log grepped for the relevant thread id. It shows the 400 in the
> last lines, but nothing else seems odd.
> http://pastebin.com/xWCYmnXV
>
> Thanks for your interest.
>
>
> On Mon, Apr 8, 2013 at 8:21 PM, Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:
>>
>> Each bucket has a unique prefix which you can get by doing radosgw-admin
>> bucket stats on that bucket. You can grep that prefix in 'rados ls -p
>> .rgw.buckets'.
>>
>> Do you have any rgw log showing why you get the Invalid Request response?
>> Can you also add 'debug ms = 1' for the log?
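For reference, a hedged ceph.conf sketch of such a debug setting (the section name depends on how the gateway is deployed, and `debug rgw = 20` is a commonly suggested companion setting, not something requested in this thread):

```ini
[client.radosgw.gateway]   ; hypothetical section name
    debug rgw = 20         ; verbose rgw-level logging
    debug ms = 1           ; messenger-level logging, as requested above
```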
>>
>> Thanks
>>
>>
>> On Mon, Apr 8, 2013 at 10:12 AM, Erdem Agaoglu <erdem.agaoglu@xxxxxxxxx>
>> wrote:
>>>
>>> Just tried that file:
>>>
>>> $ s3cmd mv s3://imgiz/data/avatars/492/492923.jpg
>>> s3://imgiz/data/avatars/492/492923.jpg
>>> ERROR: S3 error: 400 (InvalidRequest)
>>>
>>> More verbose output shows that the string to sign was
>>>
>>> 'PUT\n\n\n\nx-amz-copy-source:/imgiz/data/avatars/492/492923.jpg\nx-amz-date:Mon,
>>> 08 Apr 2013 16:59:30
>>> +0000\nx-amz-metadata-directive:COPY\n/imgiz/data/avatars/492/492923.jpg'
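For context, a hedged sketch of how that string becomes the request signature under AWS signature v2 (the scheme s3cmd uses here). The secret key below is a placeholder, not a real credential:

```python
import base64
import hashlib
import hmac

# The StringToSign quoted above, with the \n escapes expanded.
string_to_sign = (
    "PUT\n\n\n\n"
    "x-amz-copy-source:/imgiz/data/avatars/492/492923.jpg\n"
    "x-amz-date:Mon, 08 Apr 2013 16:59:30 +0000\n"
    "x-amz-metadata-directive:COPY\n"
    "/imgiz/data/avatars/492/492923.jpg"
)
secret_key = b"PLACEHOLDER-SECRET"  # hypothetical credential

# Signature v2: base64(HMAC-SHA1(secret, StringToSign))
signature = base64.b64encode(
    hmac.new(secret_key, string_to_sign.encode(), hashlib.sha1).digest()
).decode()
```

Note the 400 here is not a signing problem; as clarified later in the thread, it is the COPY metadata directive on a self-copy that rgw rejects.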
>>>
>>> But I guess it doesn't work even when the index is correct; I get the
>>> same response on a healthy bucket too.
>>>
>>> We might try that but we don't have a file list. I guess it's possible
>>> with 'rados ls | grep | sed'?
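A hedged sketch of that pipeline: strip the bucket's marker prefix (4470.1 here, obtainable from `radosgw-admin bucket stats`) off the rados object names. Sample lines stand in for the real `rados ls -p .rgw.buckets` output so the sketch is self-contained:

```shell
# In practice the input would come from: rados ls -p .rgw.buckets
printf '%s\n' \
  '4470.1_data/avatars/492/492923.jpg' \
  '4470.1_data/avatars/275/275479.jpg' \
  '6912.3_other/bucket/object.jpg' \
| grep '^4470\.1_' \
| sed 's/^4470\.1_//'
```

This is approximate: special entries such as multipart/shadow objects in the pool would need extra filtering, so treat the resulting list as a starting point rather than an exact inventory.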
>>>
>>>
>>> On Mon, Apr 8, 2013 at 7:53 PM, Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:
>>>>
>>>> Can you try copying one of these objects to itself? Would that work
>>>> and/or change the index entry? Another option would be to try copying all
>>>> the objects to a different bucket.
>>>>
>>>>
>>>> On Mon, Apr 8, 2013 at 9:48 AM, Erdem Agaoglu <erdem.agaoglu@xxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> The omap header and all other omap attributes were destroyed. I copied
>>>>> another index over the destroyed one to get a somewhat valid header and it
>>>>> seems intact. After a 'check --fix':
>>>>>
>>>>>  # rados -p .rgw.buckets getomapheader .dir.4470.1
>>>>> header (49 bytes) :
>>>>> 0000 : 03 02 2b 00 00 00 01 00 00 00 01 02 02 18 00 00 : ..+.............
>>>>> 0010 : 00 7d 7a 3f 6e 01 00 00 00 00 d0 00 7e 01 00 00 : .}z?n.......~...
>>>>> 0020 : 00 bb f5 01 00 00 00 00 00 00 00 00 00 00 00 00 : ................
>>>>> 0030 : 00                                              : .
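For what it's worth, the 49 bytes above decode consistently with the 'bucket check' numbers later in this message. A hedged sketch (the field names and layout are my inference from dumped bucket-dir headers, not authoritative):

```python
import struct

# The 49-byte header dumped above, as hex.
raw = bytes.fromhex(
    "03022b000000"      # outer struct: v3, compat 2, 43 payload bytes
    "01000000"          # one category in the stats map
    "01"                # category key (1 = rgw.main, assumed)
    "020218000000"      # stats struct: v2, compat 2, 24 payload bytes
    "7d7a3f6e01000000"  # total_size in bytes (little-endian u64)
    "00d0007e01000000"  # total_size_rounded in bytes
    "bbf5010000000000"  # num_entries
    "0000000000000000"  # trailing field, zero here
)
assert len(raw) == 49

# Stats payload starts at offset 17 (6 outer + 4 count + 1 key + 6 stats hdr).
total_size, total_size_rounded, num_entries = struct.unpack_from("<QQQ", raw, 17)
size_kb = (total_size + 1023) // 1024                  # rounded up to KB
size_kb_actual = (total_size_rounded + 1023) // 1024
```

The decoded values (size_kb 6000607, size_kb_actual 6258740, 128443 entries) match the 'bucket check' output, so the copied-over header itself looks internally consistent.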
>>>>>
>>>>>
>>>>> Rados shows objects are there:
>>>>>
>>>>> # rados ls -p .rgw.buckets |grep 4470.1_data/avatars
>>>>> 4470.1_data/avatars/11047/11047823_20101211154308.jpg
>>>>> 4470.1_data/avatars/106/106976-orig
>>>>> 4470.1_data/avatars/492/492923.jpg
>>>>> 4470.1_data/avatars/275/275479.jpg
>>>>> ...
>>>>>
>>>>>
>>>>> And I am able to GET them:
>>>>>
>>>>> $ s3cmd get s3://imgiz/data/avatars/492/492923.jpg
>>>>> s3://imgiz/data/avatars/492/492923.jpg -> ./492923.jpg  [1 of 1]
>>>>>  3587 of 3587   100% in    0s    93.40 kB/s  done
>>>>>
>>>>>
>>>>> But I am unable to list them:
>>>>>
>>>>> $ s3cmd ls s3://imgiz/data/avatars
>>>>> <NOTHING>
>>>>>
>>>>>
>>>>> My initial expectation was that 'bucket check --fix --check-objects'
>>>>> would actually read the objects like 'rados ls' does and recreate the
>>>>> missing omap keys, but it doesn't seem to do that. Now a simple check says
>>>>>
>>>>> # radosgw-admin bucket check -b imgiz
>>>>> { "existing_header": { "usage": { "rgw.main": { "size_kb": 6000607,
>>>>>               "size_kb_actual": 6258740,
>>>>>               "num_objects": 128443}}},
>>>>>   "calculated_header": { "usage": { "rgw.main": { "size_kb": 6000607,
>>>>>               "size_kb_actual": 6258740,
>>>>>               "num_objects": 128443}}}}
>>>>>
>>>>> But I know we have more than 128k objects.
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 8, 2013 at 7:17 PM, Yehuda Sadeh <yehuda@xxxxxxxxxxx>
>>>>> wrote:
>>>>>>
>>>>>> We'll need to have more info about the current state. Was just the
>>>>>> omap header destroyed, or does it still exist? What does the header
>>>>>> contain now? Are you able to actually access objects in that bucket,
>>>>>> but just fail to list them?
>>>>>>
>>>>>> On Mon, Apr 8, 2013 at 8:34 AM, Erdem Agaoglu
>>>>>> <erdem.agaoglu@xxxxxxxxx> wrote:
>>>>>> > Hi again,
>>>>>> >
>>>>>> > I managed to replace the file with some other bucket's index.
>>>>>> > '--check-objects --fix' worked, but my hopes were dashed as it didn't
>>>>>> > actually read through the files or fix anything. Any suggestions?
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Apr 4, 2013 at 5:56 PM, Erdem Agaoglu
>>>>>> > <erdem.agaoglu@xxxxxxxxx>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> Hi all,
>>>>>> >>
>>>>>> >> After a major failure and getting our cluster health back to OK (with
>>>>>> >> some help from the Inktank folks, thanks), we found out that we had
>>>>>> >> managed to corrupt one of our bucket indices. As far as I can track
>>>>>> >> it, we are missing the omap header on that specific index, so we're
>>>>>> >> unable to use the radosgw-admin tools to fix it.
>>>>>> >>
>>>>>> >> While a healthy (smaller) bucket answers
>>>>>> >> # radosgw-admin bucket check -b imgdoviz
>>>>>> >> { "existing_header": { "usage": { "rgw.main": { "size_kb": 4140,
>>>>>> >>               "size_kb_actual": 4484,
>>>>>> >>               "num_objects": 157}}},
>>>>>> >>   "calculated_header": { "usage": { "rgw.main": { "size_kb": 4140,
>>>>>> >>               "size_kb_actual": 4484,
>>>>>> >>               "num_objects": 157}}}}
>>>>>> >>
>>>>>> >> The faulty one fails with
>>>>>> >> # radosgw-admin bucket check -b imgiz
>>>>>> >> failed to list objects in bucket=imgiz(@.rgw.buckets[4470.1]) err=(22) Invalid argument
>>>>>> >> failed to check index err=(22) Invalid argument
>>>>>> >>
>>>>>> >> When I push further:
>>>>>> >> # radosgw-admin bucket check -b imgiz --check-objects --fix
>>>>>> >> failed to list objects in bucket=imgiz(@.rgw.buckets[4470.1]) err=(22) Invalid argument
>>>>>> >> Checking objects, decreasing bucket 2-phase commit timeout.
>>>>>> >> ** Note that timeout will reset only when operation completes successfully **
>>>>>> >> ERROR: failed operation r=-22
>>>>>> >> ERROR: failed operation r=-22
>>>>>> >> ..
>>>>>> >> The last line keeps repeating without any progress.
>>>>>> >>
>>>>>> >> I checked the omap headers, and while the healthy bucket has:
>>>>>> >> # rados -p .rgw.buckets getomapheader .dir.6912.3
>>>>>> >> header (49 bytes) :
>>>>>> >> 0000 : 03 02 2b 00 00 00 01 00 00 00 01 02 02 18 00 00 : ..+.............
>>>>>> >> 0010 : 00 a8 af 40 00 00 00 00 00 00 10 46 00 00 00 00 : ...@.......F....
>>>>>> >> 0020 : 00 9d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 : ................
>>>>>> >> 0030 : 00                                              : .
>>>>>> >>
>>>>>> >> the faulty one is missing it
>>>>>> >> # rados -p .rgw.buckets getomapheader .dir.4470.1
>>>>>> >> header (0 bytes) :
>>>>>> >>
>>>>>> >>
>>>>>> >> I'm currently trying to understand how to create a readable header.
>>>>>> >> My hope is that even if it's wrong, radosgw-admin will be able to
>>>>>> >> read through the objects and fix the necessary parts. But I'm not
>>>>>> >> sure how to set the new header. There is a setomapheader counterpart
>>>>>> >> to getomapheader, but it only accepts values from the command line,
>>>>>> >> so I don't know how to push an rgw-readable binary with it.
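One hedged way to produce such a binary: pack the header yourself and write it with librados rather than the CLI (the C++ librados ObjectWriteOperation exposes an omap_set_header op; `rados setomapheader` cannot carry arbitrary binary on the command line). The layout below is inferred from getomapheader dumps of healthy buckets; the field names, the category byte, and the trailing zero field are my guesses, not authoritative:

```python
import struct

# Pack a minimal bucket-dir omap header (outer struct v3) from three
# counters. Layout inferred from observed dumps, not from rgw source.
def pack_header(total_size, total_size_rounded, num_entries):
    stats = struct.pack("<QQQ", total_size, total_size_rounded, num_entries)
    stats = struct.pack("<BBI", 2, 2, len(stats)) + stats  # stats struct v2
    payload = (
        struct.pack("<I", 1)    # one category in the stats map
        + bytes([1])            # category key: rgw.main (assumed)
        + stats
        + struct.pack("<Q", 0)  # trailing field, zero in observed dumps
    )
    return struct.pack("<BBI", 3, 2, len(payload)) + payload  # outer v3 wrapper

# Write the blob to a file; binary like this would then have to be
# pushed via librados (e.g. omap_set_header), not the rados CLI.
with open("/tmp/dir_header.bin", "wb") as f:
    f.write(pack_header(0, 0, 0))
```

Packing the counters from this thread's broken bucket reproduces byte-for-byte the 49-byte header shown later in the thread after the copy-over, which is some evidence the layout guess is right.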
>>>>>> >>
>>>>>> >> Is this somewhat possible?
>>>>>> >>
>>>>>> >> Thanks in advance.
>>>>>> >>
>>>>>> >> --
>>>>>> >> erdem agaoglu
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > erdem agaoglu
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > ceph-users mailing list
>>>>>> > ceph-users@xxxxxxxxxxxxxx
>>>>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>> >
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> erdem agaoglu
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> erdem agaoglu
>>
>>
>
>
>
> --
> erdem agaoglu



--
erdem agaoglu



--
erdem agaoglu
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
