Re: OSD not coming up after being set down

On 2-3-2016 21:11, Samuel Just wrote:
> At this point, you will want to run the script and then dig through
> the logs until you find something that doesn't match.
> - Was osd.0 up to begin with?
> - Is its process running?
> - Did it get the map marking it down?
> - Did it send a boot message back to the mon requesting that it be
> marked back up?
> - Did the mon get that message?
> - Did the mon create a new map marking it up?
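
As a starting point for the checklist above, here is a minimal sketch of two helpers that read `ceph osd dump` output on stdin and answer "is osd.N marked up or down in the current map?" and "is a given flag (for example noup) still set?". The function names osd_state and osdmap_has_flag are made up for this illustration and are not part of Ceph. The sketch assumes the plain-text dump layout shown further down in this thread, with one "osd.N up|down ..." line per OSD and a single comma-separated "flags" line.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Ceph). Both read the plain-text
# output of `ceph osd dump` on stdin, so they can also be run against
# a saved copy of the dump.

# osd_state ID
# Prints the second field of the "osd.ID" line ("up" or "down"),
# or "unknown" if the dump has no line for that OSD.
osd_state() {
    awk -v name="osd.$1" '
        $1 == name { print $2; found = 1; exit }
        END { if (!found) print "unknown" }
    '
}

# osdmap_has_flag FLAG
# Exit status 0 if FLAG appears in the comma-separated "flags" line,
# 1 otherwise.
osdmap_has_flag() {
    awk -v flag="$1" '
        $1 == "flags" {
            n = split($2, f, ",")
            for (i = 1; i <= n; i++)
                if (f[i] == flag) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example use against a running test cluster:
#   ceph osd dump | osd_state 0
#   ceph osd dump | osdmap_has_flag noup && echo "noup is still set"
```

This only covers what the current map says. Whether the OSD received the map marking it down, sent a boot message, and whether the mon acted on it still has to come from the OSD and mon logs.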

Right, this is the sort of handholding I was looking for.

The first two items are true.
Who sends "the map marking it down"?
	ceph osd down 0 => Mon => OSD
Or does that go directly ceph => OSD?

Are there any state machine diagrams of this in the manuals?

--WjW

> Etc
> -Sam
> 
> On Wed, Mar 2, 2016 at 11:56 AM, Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
>> On 2-3-2016 18:01, M Ranga Swami Reddy wrote:
>>> Please see the below:
>>> ---
>>> If something is causing OSDs to ‘flap’ (repeatedly getting marked
>>> down and then up again), you can force the monitors to stop the
>>> flapping with:
>>>
>>> ceph osd set noup      # prevent OSDs from getting marked up
>>> ceph osd set nodown    # prevent OSDs from getting marked down
>>> ----
>>> ref: http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-osd/
>>
>> I don't think this is the issue.
>>
>> The test code should run as is. It runs fine on Linux, but FreeBSD is
>> giving trouble.
>> The OSD should come back up, but does not. Possible causes:
>> - the OSD is not receiving the UP
>> - the OSD is not able to go UP
>> - or the monitors are not picking it up?
>>
>> --WjW
>>
>>> On Wed, Mar 2, 2016 at 9:33 PM, Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> Any handholding is welcomed!!
>>>>
>>>> In test/cephtool-mon-test.sh, part of the executed code is:
>>>>   ceph osd down 0
>>>>   ceph osd dump | grep 'osd.0 down'
>>>>   ceph osd unset noup
>>>>   for ((i=0; i < 120; i++)); do
>>>>     if ! ceph osd dump | grep 'osd.0 up'; then
>>>>       echo "waiting for osd.0 to come back up"
>>>>       sleep 1
>>>>     else
>>>>       break
>>>>     fi
>>>>   done
>>>>   ceph osd dump | grep 'osd.0 up'
>>>>
>>>> But the OSD refuses to come back up.
>>>> The output of the dump is below.
>>>>
>>>> How would I start analyzing this issue?
>>>> What kind of things would I expect to see in the logfile?
>>>>   - if the OSD does come up?
>>>>   - if the OSD stays down?
>>>>
>>>> Thanks,
>>>> --WjW
>>>>
>>>>
>>>> *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
>>>> epoch 170
>>>> fsid 8b5c0b4b-e08c-11e5-8cd4-1c6f6582ec12
>>>> created 2016-03-02 16:36:35.001700
>>>> modified 2016-03-02 16:45:17.802073
>>>> flags sortbitwise
>>>> pool 0 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 1 flags hashpspool stripe_width 0
>>>> max_osd 3
>>>> osd.0 down out weight 0 up_from 4 up_thru 163 down_at 166 last_clean_interval [0,0) 127.0.0.1:6804/2455 127.0.0.1:6805/2455 127.0.0.1:6806/2455 127.0.0.1:6807/2455 autoout,exists 8bc29c74-e08c-11e5-8cd4-1c6f6582ec12
>>>> osd.1 up   in  weight 1 up_from 8 up_thru 166 down_at 0 last_clean_interval [0,0) 127.0.0.1:6808/2475 127.0.0.1:6811/2475 127.0.0.1:6813/2475 127.0.0.1:6816/2475 exists,up 8d7a2cb5-e08c-11e5-8cd4-1c6f6582ec12
>>>> osd.2 up   in  weight 1 up_from 13 up_thru 166 down_at 0 last_clean_interval [0,0) 127.0.0.1:6817/2495 127.0.0.1:6818/2495 127.0.0.1:6819/2495 127.0.0.1:6820/2495 exists,up 8f46df05-e08c-11e5-8cd4-1c6f6582ec12
>>>> pg_temp 0.0 [0,2,1]
>>>> pg_temp 0.1 [2,0,1]
>>>> pg_temp 0.2 [0,1,2]
>>>> pg_temp 0.3 [2,0,1]
>>>> pg_temp 0.4 [0,2,1]
>>>> pg_temp 0.5 [0,2,1]
>>>> pg_temp 0.6 [0,1,2]
>>>> pg_temp 0.7 [1,0,2]
>>>> 2016-03-02 16:56:11.027977 8021d7800  0 lockdep stop
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
