Re: Problems after crash yesterday


 



On Thu, Feb 23, 2012 at 9:14 PM, Gregory Farnum
<gregory.farnum@xxxxxxxxxxxxx> wrote:
> On Wed, Feb 22, 2012 at 12:25 PM, Jens Rehpöhler
> <jens.rehpoehler@xxxxxxxx> wrote:
>> Hi Gregory,
>>
>>
>> On 22.02.2012 18:12, Gregory Farnum wrote:
>>> On Feb 22, 2012, at 1:53 AM, "Jens Rehpöhler" <jens.rehpoehler@xxxxxxxx> wrote:
>>>
>>>> Some additions: meanwhile we are at the following state:
>>>>
>>>> 2012-02-22 10:38:49.587403    pg v1044553: 2046 pgs: 2036 active+clean,
>>>> 10 active+clean+inconsistent; 2110 GB data, 4061 GB used, 25732 GB /
>>>> 29794 GB avail
>>>>
>>>> The active+recovering+remapped+backfill state disappeared after a
>>>> restart of a crashed OSD.
>>>>
>>>> The OSD crashed after issuing the command "ceph pg repair 106.3".
>>>>
>>>> The repeating message is still there:
>>> Hmm. These messages indicate there are requests that came in that
>>> never got answered -- or else that the tracking code isn't quite right
>>> (it's new functionality). What version are you running?
>> We use:
>>
>> root@fcmsnode0:~# ceph -v
>> ceph version 0.42-62-gd6de0bb
>> (commit:d6de0bb83bcac238b3a6a376915e06fb7129b2c8)
>>
>> Kernel is 3.2.1
>>
>> I accidentally updated one of our OSDs to 0.42, so we updated the whole
>> cluster.
>>
>> The OSD repeatedly crashed while we issued the "repair" command. The
>> inconsistent PGs are all on the same (newly added) node.
>
> Oh, that's interesting. Are all the other nodes in the cluster up and in?
>
> In the next version or two we will have a lot more capability to look
> into what's happening with stuck PGs like this, but for the moment we
> need a log. If all the other nodes in the system are up, can you
> restart this new OSD with "debug osd = 20" and "debug ms = 1" added to
> its config?
> -Greg
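
[Editor's note: Greg's suggestion above amounts to adding two debug settings to the affected OSD's section of ceph.conf and then restarting that daemon. A minimal sketch follows; the "[osd.6]" id is a placeholder for the newly added OSD, not taken from the thread.]

```ini
; ceph.conf fragment -- [osd.6] stands in for the new OSD's actual id
[osd.6]
    debug osd = 20   ; verbose OSD-level logging
    debug ms = 1     ; messenger (network) logging
```

After restarting that one OSD, its log should contain the detail needed to diagnose the stuck PGs.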

Actually, I suspect this might be related to that bug you reported
with the messenger. If you like you can just cherry-pick
244b70296622906f01cfa3d48c931aa08e663a75 (currently HEAD on the next
branch) onto your current install and see if that fixes things...
-Greg
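
[Editor's note: the cherry-pick workflow Greg describes can be sketched generically as below. The repository, branches, and commit here are toy stand-ins built in a scratch directory; in practice you would run the cherry-pick with hash 244b70296622906f01cfa3d48c931aa08e663a75, fetched from ceph.git's "next" branch, inside your own source checkout before rebuilding.]

```shell
# Generic cherry-pick sketch: apply one fix commit from a development
# branch onto the branch you actually run. Everything here is illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git checkout -q -b master
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "base"
git checkout -q -b next                 # stands in for ceph's "next" branch
echo "messenger fix" > fix.txt
git add fix.txt
git -c user.email=a@b -c user.name=a commit -q -m "msgr: fix bug"
fixhash=$(git rev-parse HEAD)           # stands in for 244b702... in ceph.git
git checkout -q master                  # back to the deployed code
git -c user.email=a@b -c user.name=a cherry-pick "$fixhash"
cat fix.txt                             # the fix commit is now on master
```

The point of cherry-picking (rather than merging "next" wholesale) is that only the single suspect fix lands on the running install, so any change in behavior can be attributed to it.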

