Re: Ceph octopus version cluster not starting

Hi. When I have issues like this, it sometimes helps to start a daemon manually (not via systemctl or anything like that). First make sure no ceph-mon is running on the host:

ps -eo cmd | grep '[c]eph-mon'

and start a ceph-mon manually with a command like this (make sure the binary is the correct version):

/usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph --foreground -i MON-NAME --mon-data /var/lib/ceph/mon/STORE --public-addr MON-IP

Depending on your debug settings, this command prints a bit of output on startup. If the debug levels in your ceph.conf are set to 0/0, I think you can override them on the command line. It might also be useful to add the option "-d" (debug mode with "log to stderr"). With defaults, the MON will at least report that it is opening the store and then either just wait or complain that there are no peers.

Either of these is a good sign.
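
As a rough sketch of the command-line override mentioned above: the snippet below only prints a ceph-mon command line with raised debug levels, so you can check it before running it. MON-NAME, STORE and MON-IP are the same placeholders as in the command above. The levels 10 and 1 for --debug-mon and --debug-ms are just values I picked for illustration, and I am assuming your ceph-mon build accepts config options in this form on the command line.

```shell
#!/bin/sh
# Sketch: print (not run) a ceph-mon command with debug overrides.
# MON_NAME, MON_STORE and MON_IP are placeholders; set them for your host.
MON_NAME="${MON_NAME:-MON-NAME}"
MON_STORE="${MON_STORE:-STORE}"
MON_IP="${MON_IP:-MON-IP}"

build_mon_cmd() {
    # -d          : stay in the foreground and log to stderr
    # --debug-mon : assumed override of the "debug mon" level from ceph.conf
    # --debug-ms  : assumed override of the messenger debug level
    printf '%s ' /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph \
        -d -i "$MON_NAME" --mon-data "/var/lib/ceph/mon/$MON_STORE" \
        --public-addr "$MON_IP" --debug-mon 10 --debug-ms 1
    printf '\n'
}

build_mon_cmd
```

Copy the printed line into a root shell once the placeholders are filled in. If the daemon still prints nothing, raising the levels further would be the next thing to try.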

Once you have one MON running, start another one on another host, and so on, until you have enough up for quorum. Then you can start querying the MONs about what their problem is.
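
One way to query a single MON, sketched below, is through its local admin socket, which answers even without quorum because the request does not go through the cluster. The socket path is the usual default for a cluster named "ceph". This is an assumption, so adjust it if your ceph.conf sets "admin socket" to something else. MON-NAME is a placeholder again.

```shell
#!/bin/sh
# Sketch: ask a locally running MON for its own status via the admin socket.
MON_NAME="${MON_NAME:-MON-NAME}"   # placeholder, use the MON's real id

asok_path() {
    # Assumed default admin socket location for cluster "ceph"
    printf '/var/run/ceph/ceph-mon.%s.asok\n' "$1"
}

mon_status() {
    sock=$(asok_path "$1")
    if [ -S "$sock" ]; then
        # Talks to the daemon directly, no quorum required
        ceph --admin-daemon "$sock" mon_status
    else
        echo "no admin socket at $sock (is the MON running?)" >&2
        return 1
    fi
}

mon_status "$MON_NAME" || true
```

Once enough MONs are up, "ceph quorum_status" and "ceph -s" should start answering as well.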

If none of this works, the output of the manual command, possibly with higher debug settings on the command line, should be helpful.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Amudhan P <amudhan83@xxxxxxxxx>
Sent: Monday, September 16, 2024 10:36 AM
To: Eugen Block
Cc: ceph-users@xxxxxxx
Subject:  Re: Ceph octopus version cluster not starting

No, I don't use cephadm, and I have enough space for log storage.

When I try to start the mon service on any of the nodes, it just keeps waiting to
complete, without any error message on stdout or in the log file.

On Mon, Sep 16, 2024 at 1:21 PM Eugen Block <eblock@xxxxxx> wrote:

> Hi,
>
> I would focus on the MONs first. If they don't start, your cluster is
> not usable. It doesn't look like you use cephadm, but please confirm.
> Check if the nodes are running out of disk space, maybe that's why
> they don't log anything and fail to start.
>
>
> Zitat von Amudhan P <amudhan83@xxxxxxxxx>:
>
> > Hi,
> >
> > I recently added one disk to the Ceph cluster using "ceph-volume lvm create
> > --data /dev/sdX", but the new OSD didn't start. After some time, the OSD
> > services on the other nodes also stopped, so I restarted all nodes in the
> > cluster.
> > After the restart, the MON, MDS, MGR and OSD services are not starting. I
> > couldn't find any new logs either; since the restart it is totally silent
> > on all nodes.
> > I could find some logs from the ceph-volume service.
> >
> >
> > Errors in the ceph-volume logs:
> > [2024-09-15 23:38:15,080][ceph_volume.process][INFO  ] stderr Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-5
> > --> Executable selinuxenabled not in PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
> > Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-5
> > Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e --path /var/lib/ceph/osd/ceph-5 --no-mon-config
> >  stderr: failed to read label for /dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e: (2) No such file or directory
> > 2024-09-15T23:38:15.059+0530 7fe7767c8100 -1 bluestore(/dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e) _read_bdev_label failed to open /dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e: (2) No such file or directory
> > -->  RuntimeError: command returned non-zero exit status: 1
> > [2024-09-15 23:38:15,084][ceph_volume.process][INFO  ] stderr Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-2
> > --> Executable selinuxenabled not in PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
> > Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
> > Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-9a9b8328-66ad-4997-8b9f-5216b56b73e8/osd-block-ac2ae41d-3b77-4bfd-ba5c-737e4266e988 --path /var/lib/ceph/osd/ceph-2 --no-mon-config
> >  stderr: failed to read label for /dev/ceph-9a9b8328-66ad-4997-8b9f-5216b56b73e8/osd-block-ac2ae41d-3b77-4bfd-ba5c-737e4266e988: (2) No such file or directory
> >
> > But the path
> > "/dev/ceph-9a9b8328-66ad-4997-8b9f-5216b56b73e8/osd-block-ac2ae41d-3b77-4bfd-ba5c-737e4266e988"
> > is valid, and I can list it.
> >
> > I'm not sure how to proceed or where to start. Any ideas or suggestions?
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx