Re: How to remove lost objects.

2012/1/19 Gregory Farnum <gregory.farnum@xxxxxxxxxxxxx>:
> On Wed, Jan 18, 2012 at 12:48 PM, Andrey Stepachev <octo47@xxxxxxxxx> wrote:
>> But I still don't know what happens inside ceph that makes it
>> stop responding and hang. This is not good behavior, because
>> such a situation leads to an unresponsive cluster in case of
>> a temporary network failure.
>
> I'm a little concerned about this — I would expect to see hangs of up
> to ~30 seconds (the timeout period), but for operations to then
> continue. Are you putting the MDS down? If so, do you have any
> standbys specified?

Yes, the MDS goes down (I restarted it at some point while changing something
in the config).
Yes, I have 2 standbys.
Clients hang for more than 10 minutes.
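
On the question quoted below about identifying the leftover objects: a
minimal sketch, assuming the usual CephFS data object naming of
<inode in hex>.<object index as 8 hex digits> (the name 1000000101b.0000001e
in the log line below fits that pattern). It only groups object names by
inode; the helper names are hypothetical, and the input is assumed to be a
plain list of names, one per object, for example collected from an object
listing of the data pool.

```python
import re
from collections import defaultdict

# Assumption: CephFS data objects are named "<inode hex>.<index, 8 hex digits>".
_OBJ_NAME = re.compile(r"^([0-9a-f]+)\.([0-9a-f]{8})$")


def parse_object_name(name):
    """Return (inode, object_index) as ints, or None if the name does not match."""
    m = _OBJ_NAME.match(name.strip())
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


def group_by_inode(names):
    """Map inode number -> sorted list of object indexes found for it."""
    groups = defaultdict(list)
    for name in names:
        parsed = parse_object_name(name)
        if parsed is None:
            continue  # skip names that do not look like file data objects
        ino, idx = parsed
        groups[ino].append(idx)
    return {ino: sorted(idxs) for ino, idxs in groups.items()}
```

The inode numbers come out as integers, so they can be compared with the
decimal inode numbers that ls -i prints on the mounted filesystem. An inode
that has objects but no file would only be a candidate for an orphan, not
proof of one.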

>
>
>> 2012/1/18 Andrey Stepachev <octo47@xxxxxxxxx>:
>>> Hi,
>>>
>>> I've been testing ceph against a laggy network (0ms-400ms delays).
>>> At some point I got many messages like:
>>> 2012-01-18 16:06:49.184776 7ff134119700 -- 84.201.161.73:6801/25424
>>> send_message dropped message osd_op_reply(291 1000000101b.0000001e
>>> [write 66734080~374784] ondisk = 0) v1 because of no pipe on con 0x315e640
>>> And ceph doesn't respond to ls on some of the subdirs (via hadoop fs -ls
>>> or the kernel client).
>>> My cluster was running with no debug logging at that moment, so I can't
>>> find out what was going on.
>>>
>>> After a restart, ceph writes to the log:
>>> 2012-01-18 16:10:39.985509 7f217989d780 osd.1 155 pg[0.155( v 136'373
>>> (94'368,136'373]+backlog n=3 ec=1 les/c 150/145 146/151/58) [] r=0
>>> lpr=0 (info mismatch, log(94'368,0'0]+backlog) (log bound mismatch,
>>> actual=[8'124,94'369]) lcod 0'0 mlcod 0'0 inactive] read_log  got dup
>>> 94'369 (last was 94'369, dropping that one)
>>>
>>> After these strange hangs I found that, after rm -rf on the filesystem
>>> (mounted via the kernel client),
>>> the fs shows that 210Gb is still in use. Looking at /data/osd.x I found
>>> many objects inside.
>>> So:
>>> a) it looks like some errors leave orphaned objects in rados
>>> b) I can't find a utility that can check for that orphaned data (and
>>> clean it up)
>>>
>>> Question: how can I identify what these objects are, and how can I
>>> clean them up?
>>>
>>> --
>>> Andrey.
>>
>>
>>
>> --
>> Andrey.
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Andrey.