Re: two osds stuck in peering after starting an osd to recover


 



There is almost the same problem with a 0.61 cluster, at least with the
same symptoms. It can be reproduced quite easily: remove an osd and then
mark it as out, and with quite high probability one of its neighbors
will get stuck at the end of the peering process with a couple of
peering pgs whose primary copy is on it. That osd process seems to be
stuck in some kind of lock, eating exactly 100% of one core.
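
For reference, this is roughly how I reproduce it; the osd id 12 below
is just a placeholder, and the exact init command depends on the distro:

    # stop the osd daemon
    service ceph stop osd.12
    # mark it out so the cluster starts rebalancing
    ceph osd out 12
    # then watch for pgs that never leave the peering state
    ceph -s
    ceph pg dump_stuck inactive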

On Thu, Jun 13, 2013 at 8:42 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> On Thu, Jun 13, 2013 at 6:33 AM, Sławomir Skowron <szibis@xxxxxxxxx> wrote:
>> Hi, sorry for the late response.
>>
>> https://docs.google.com/file/d/0B9xDdJXMieKEdHFRYnBfT3lCYm8/view
>>
>> Logs are attached and on Google Drive, from today.
>>
>> https://docs.google.com/file/d/0B9xDdJXMieKEQzVNVHJ1RXFXZlU/view
>>
>> We had this problem today, and the new logs on Google Drive are from today's date.
>>
>> What is strange is that the problematic osd.71 has about 10-15% more
>> space used than the other osds in the cluster.
>>
>> Today, within one hour, osd.71 failed 3 times in the mon log, and after
>> the third failure recovery got stuck, and many 500 errors appeared in
>> the http layer on top of rgw. When it is stuck, restarting osd.71,
>> osd.23, and osd.108, all from the stuck pg, helps, but I also ran a
>> repair on this osd, just in case.
>>
>> My theory is that the rgw object index lives on this pg, or that one of
>> the osds in this pg has a problem with its local filesystem or the drive
>> below it (the raid controller reports nothing), but I do not see any
>> problem in the system.
>>
>> How can we find which pg/osd holds the index of objects in an rgw bucket?
>
> You can find the location of any named object by grabbing the OSD map
> from the cluster and using the osdmaptool: "osdmaptool <mapfile>
> --test-map-object <objname> --pool <poolid>".
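
For anyone following along, this is roughly what that lookup looks like;
the pool id and the object name below are placeholders (as far as I know,
rgw keeps the bucket index in an object named ".dir.<bucket marker>", so
substitute the actual marker for your bucket):

    # grab the current osdmap from the cluster
    ceph osd getmap -o /tmp/osdmap
    # map the named object to its pg and the acting osds
    osdmaptool /tmp/osdmap --test-map-object .dir.mybucket --pool 3

A quicker live alternative, if I am not mistaken, is "ceph osd map
<poolname> <objname>", which prints the same pg/osd mapping without
dumping the map to a file.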
>
> You're not providing any context for your issue though, so we really
> can't help. What symptoms are you observing?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com




