Re: osd marked down

I'm not sure if anything else could break, but since the OSD isn't starting anyway... I guess you could delete osd.3 from ceph auth:

ceph auth del osd.3

And then recreate it with:

ceph auth get-or-create osd.3 mon 'allow profile osd' osd 'allow *' mgr 'allow profile osd'
[osd.3]
        key = <NEW_KEY>

Then create a keyring file /var/lib/ceph/osd/ceph-3/keyring with the respective content:

[osd.3]
        key = <NEW_KEY>
        caps mgr = "allow profile osd"
        caps mon = "allow profile osd"
        caps osd = "allow *"


Make sure the file owner is ceph and try to restart the OSD. In this case you wouldn't need to import anything. This just worked for me in my lab environment, so give it a shot.
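
In case it helps, here is a minimal shell sketch of the keyring step described above. The function name write_osd_keyring and its three arguments are made up for illustration and are not part of Ceph. It only writes the file with the content shown earlier and, if run as root on a host that has a ceph user, changes the owner. It does not talk to the cluster.

```shell
#!/bin/sh
# Hypothetical helper (not a Ceph tool): write an OSD keyring file.
# Usage: write_osd_keyring <osd-id> <key> <osd-data-dir>
write_osd_keyring() {
    osd_id=$1
    key=$2
    dir=$3
    keyring="$dir/keyring"

    # Create the file inside a subshell with a restrictive umask so the
    # key is never world-readable, even briefly.
    ( umask 077 && cat > "$keyring" <<EOF
[osd.$osd_id]
        key = $key
        caps mgr = "allow profile osd"
        caps mon = "allow profile osd"
        caps osd = "allow *"
EOF
    ) || return 1

    # The file owner should be ceph. Changing ownership needs root and
    # an existing ceph user; otherwise leave a note for the operator.
    if [ "$(id -u)" -eq 0 ] && id ceph >/dev/null 2>&1; then
        chown ceph:ceph "$keyring"
    else
        echo "note: set the owner of $keyring to ceph manually" >&2
    fi
}
```

Assuming the new key was already created with the get-or-create command above, a call could look like: write_osd_keyring 3 "$(ceph auth get-key osd.3)" /var/lib/ceph/osd/ceph-3. The data directory path is the one from this thread and may differ in your deployment.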



Quoting Abdelillah Asraoui <aasraoui@xxxxxxxxx>:

The /var/lib/ceph/osd/ceph-3/keyring file is missing here.
Is there a way to generate a keyring for osd.3?


Thanks!

On Thu, Sep 30, 2021 at 1:18 AM Eugen Block <eblock@xxxxxx> wrote:

Is the content of OSD.3 still available in the filesystem? If so, you
can get the OSD's keyring from

/var/lib/ceph/osd/ceph-3/keyring

Then update your osd.3.export file with the correct key and import the
corrected file back into Ceph.
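
A minimal sketch of the "update the export file" step, assuming osd.3.export has the usual keyring layout with a single "key = ..." line. The helper name set_export_key is hypothetical. It only rewrites that one line and leaves the section header and caps lines untouched, so check that the header reads [osd.3] before importing.

```shell
#!/bin/sh
# Hypothetical helper: replace the "key = ..." line in an exported keyring.
# Usage: set_export_key <export-file> <new-key>
set_export_key() {
    file=$1
    new_key=$2
    tmp="$file.tmp.$$"

    # Base64 keys may contain "/", "+" and "=", so "|" is used as the sed
    # delimiter. Only a line starting with optional whitespace and "key"
    # is rewritten; "caps ..." lines do not match.
    sed "s|^\([[:space:]]*key[[:space:]]*=[[:space:]]*\).*|\1$new_key|" \
        "$file" > "$tmp" && mv "$tmp" "$file"
}
```

After something like set_export_key osd.3.export "<KEY_FROM_OSD_KEYRING>", the file can be imported again with ceph auth import -i osd.3.export.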


Quoting Abdelillah Asraoui <aasraoui@xxxxxxxxx>:

> I must have imported the osd.2 key instead; now osd.3 has the same key as
> osd.2.
>
> ceph auth import -i osd.3.export
>
>
> How do we update this?
>
> Thanks!
>
>
>
> On Wed, Sep 29, 2021 at 2:13 AM Eugen Block <eblock@xxxxxx> wrote:
>
>> Just to clarify, you didn't simply import the unchanged keyring but
>> modified it to reflect the actual key of OSD.3, correct? If not, run
>> 'ceph auth get osd.3' first and set the key in the osd.3.export file
>> before importing it to ceph.
>>
>>
>> Quoting Abdelillah Asraoui <aasraoui@xxxxxxxxx>:
>>
>> > I have created a keyring for osd.3, but the pod is still not booting up.
>> >
>> > As outlined:
>> > https://access.redhat.com/solutions/3524771
>> >
>> > ceph auth export osd.2 -o osd.2.export
>> > cp osd.2.export osd.3.export
>> > ceph auth import -i osd.3.export
>> > imported keyring
>> >
>> >
>> > Any suggestions?
>> >
>> > Thanks!
>> >
>> > On Tue, Sep 21, 2021 at 8:34 AM Abdelillah Asraoui <aasraoui@xxxxxxxxx>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> One of the OSDs in the cluster went down. Is there a workaround to bring
>> >> this OSD back?
>> >>
>> >>
>> >> The logs from the Ceph OSD pod show the following:
>> >>
>> >> kubectl -n rook-ceph logs rook-ceph-osd-3-6497bdc65b-pn7mg
>> >>
>> >> debug 2021-09-20T14:32:46.388+0000 7f930fe9cf00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-3/keyring: (13) Permission denied
>> >> debug 2021-09-20T14:32:46.389+0000 7f930fe9cf00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-3/keyring: (13) Permission denied
>> >> debug 2021-09-20T14:32:46.389+0000 7f930fe9cf00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-3/keyring: (13) Permission denied
>> >> debug 2021-09-20T14:32:46.389+0000 7f930fe9cf00 -1 monclient: keyring not found
>> >> failed to fetch mon config (--no-mon-config to skip)
>> >>
>> >> kubectl -n rook-ceph describe pod rook-ceph-osd-3-64
>> >>
>> >> Events:
>> >>   Type     Reason   Age                      From     Message
>> >>   ----     ------   ----                     ----     -------
>> >>   Normal   Pulled   50m (x749 over 2d16h)    kubelet  Container image "ceph/ceph:v15.2.13" already present on machine
>> >>   Warning  BackOff  19s (x18433 over 2d16h)  kubelet  Back-off restarting failed container
>> >>
>> >> ceph health detail | more
>> >>
>> >> HEALTH_WARN noout flag(s) set; 1 osds down; 1 host (1 osds) down; Degraded data redundancy: 180969/542907 objects degraded (33.333%), 225 pgs degraded, 225 pgs undersized
>> >> [WRN] OSDMAP_FLAGS: noout flag(s) set
>> >> [WRN] OSD_DOWN: 1 osds down
>> >>     osd.3 (root=default,host=ab-test) is down
>> >> [WRN] OSD_HOST_DOWN: 1 host (1 osds) down
>> >>     host ab-test-mstr-1-cwan-net (root=default) (1 osds) is down
>> >> [WRN] PG_DEGRADED: Degraded data redundancy: 180969/542907 objects degraded (33.333%), 225 pgs degraded, 225 pgs undersized
>> >>     pg 3.4d is active+undersized+degraded, acting [2,0]
>> >>     pg 3.4e is stuck undersized for 3d, current state active+undersized+degraded, last acting [0,2]
>> >>
>> >>
>> >> Thanks!
>> >>
>> > _______________________________________________
>> > ceph-users mailing list -- ceph-users@xxxxxxx
>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx