Re: OSD's fail to start after power loss

Todd;

What version of Ceph are you running?  Are you running containers or packages?  Was the cluster installed manually or with a deployment tool?

The logs provided are for OSD ID 31; does ID 31 belong on that server?  Have you verified that the ceph.conf on that server is intact and correct?

Your log snippet references /var/lib/ceph/osd/ceph-31/keyring; does this file exist?  Does the /var/lib/ceph/osd/ceph-31/ directory exist?  If both exist, are the ownership and permissions correct?
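
As a rough way to answer those last questions, something like the following could be run on the affected host. It is only a sketch: check_osd_dir is a hypothetical helper name, not part of Ceph, and it assumes GNU stat (for the -c format option). It reports whether the directory and the keyring exist and prints owner, group, and mode for each.

```shell
# Sketch only: check_osd_dir is a hypothetical helper, not a Ceph tool.
# Assumes GNU coreutils stat, which supports the -c format option.
check_osd_dir() {
    dir=$1

    # The OSD data directory itself must exist.
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    # %U = owner, %G = group, %a = octal mode, %n = file name
    stat -c '%U:%G %a %n' "$dir"

    # The keyring is the file the ceph-osd log says it cannot find.
    if [ ! -f "$dir/keyring" ]; then
        echo "missing keyring: $dir/keyring"
        return 1
    fi
    stat -c '%U:%G %a %n' "$dir/keyring"
}
```

For the OSD in your log this would be called as: check_osd_dir /var/lib/ceph/osd/ceph-31 (change the ID for the other OSDs on that host). Since your ceph-osd command uses --setuser ceph --setgroup ceph, I would expect ceph:ceph as the owner, but that is an assumption about your install.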

Thank you,

Dominic L. Hilsbos, MBA
Vice President - Information Technology
Perform Air International Inc.
DHilsbos@xxxxxxxxxxxxxx
www.PerformAir.com


-----Original Message-----
From: Orbiting Code, Inc. [mailto:support@xxxxxxxxxxxxxxxx] 
Sent: Wednesday, October 13, 2021 7:21 AM
To: ceph-users@xxxxxxx
Subject:  OSD's fail to start after power loss

Hello Everyone,

I have 3 OSD hosts with 12 OSDs each. After a power failure on 1 host, 
all 12 OSDs fail to start on that host. The other 2 hosts did not lose 
power and are functioning. Obviously I don't want to restart the 
working hosts at this time. Syslog shows:

Oct 12 17:24:07 osd3 systemd[1]: ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Main process exited, code=exited, status=1/FAILURE
Oct 12 17:24:07 osd3 systemd[1]: ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Failed with result 'exit-code'.
Oct 12 17:24:07 osd3 systemd[1]: Failed to start Ceph Volume activation: lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.

This is repeated for all 12 OSDs on the failed host. Running the 
following command shows additional errors:

root@osd3:/var/log# /usr/bin/ceph-osd -f --cluster ceph --id 31 --setuser ceph --setgroup ceph
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x55c4ec50aa40) no keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x7ffe9b64eb08) no keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)

No tmpfs mounts exist for any of the /var/lib/ceph/osd/ceph-* directories.

Any assistance with this situation would be greatly appreciated.

Thank you,
Todd
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
