Re: crashed+down+peering

Christian Brunner <chb@xxxxxx> · Thu, 2 Dec 2010 20:49:58 +0100

Hi Sage,

2010/12/2 Sage Weil <sage@xxxxxxxxxxxx>:
> Hi Christian,
>
> On Thu, 2 Dec 2010, Christian Brunner wrote:
>> We have simulated the simultaneous crash of multiple osds in our
>> environment. After starting all the cosd again, we have the following
>> situation:
>>
>> 2010-12-02 16:18:33.944436    pg v724432: 3712 pgs: 1 active, 3605
>> active+clean, 1 crashed+peering, 46 down+peering, 56
>> crashed+down+peering, 3 active+clean+inconsistent; 177 GB data, 365 GB
>> used, 83437 GB / 83834 GB avail; 1/93704 degraded (0.001%)
>>
>> When I set off an "rbd rm" command for one of our rbd volumes, it seems
>> to hit the "crashed+down+peering" pg. After that the command is
>> stuck.
>
> The pg isn't active, so any IO will hang until peering completes.  What
> version of the code are you running?  If it's something from unstable
> from the last couple of weeks it's probably related to problems there;
> please upgrade and restart the osds.  If it's the latest and greatest
> 'rc', we should look at the logs to see what's going on!

We are running 0.23 - I will upgrade to the latest 'rc' tomorrow.
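
For reference, here is a rough way to count how many pgs are affected. It is only a minimal Python sketch that parses the status line quoted above; the helper names (parse_pg_summary, blocked_pgs) are made up for illustration and are not part of Ceph, and it assumes the summary keeps the "N state, N state, ...;" layout shown in that line.

```python
import re

def parse_pg_summary(line):
    """Return {state: count} parsed from a 'pg v...: N pgs: ...;' status line."""
    # The per-state counts sit between "pgs:" and the first ";".
    m = re.search(r"pgs:\s*(.*?);", line, re.S)
    if not m:
        return {}
    states = {}
    for part in m.group(1).split(","):
        count, state = part.split()
        states[state] = int(count)
    return states

def blocked_pgs(states):
    """Keep only states without 'active'; IO to those pgs hangs until peering completes."""
    return {s: n for s, n in states.items() if "active" not in s.split("+")}

# Status line copied from the report above (joined onto one logical line).
status = ("pg v724432: 3712 pgs: 1 active, 3605 active+clean, "
          "1 crashed+peering, 46 down+peering, 56 crashed+down+peering, "
          "3 active+clean+inconsistent; 177 GB data, 365 GB used, "
          "83437 GB / 83834 GB avail; 1/93704 degraded (0.001%)")

states = parse_pg_summary(status)
blocked = blocked_pgs(states)
print(sum(blocked.values()), "of", sum(states.values()), "pgs are not active")
```

Any request that maps to one of the pgs in "blocked" would be expected to hang in the same way as the "rbd rm" did.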

Thanks, Christian