Re: Ceph octopus version cluster not starting

Frank,

With the manual command I was able to start the mon and see logs in the
log file, and I don't find any issue in the logs except the lines below.
Should I stop the manual command and try to start the mon service from
systemd, or follow the same manual approach on all mon nodes?

2024-09-16T15:36:54.620+0530 7f5783d1e5c0  4 rocksdb:
> [db/version_set.cc:3757] Recovered from manifest
> file:/var/lib/ceph/mon/node/store.db/MANIFEST-4328236 succeeded,
> manifest_file_number is 4328236, next_file_number is 4328238,
> last_sequence is 1782572963, log_number is 4328223, prev_log_number is
> 0, max_column_family is 0, min_log_number_to_keep is 0
>
> 2024-09-16T15:36:54.620+0530 7f5783d1e5c0  4 rocksdb:
> [db/version_set.cc:3766] Column family [default] (ID 0), log number is
> 4328223
>
> 2024-09-16T15:36:54.620+0530 7f5783d1e5c0  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1726481214623513, "job": 1, "event": "recovery_started",
> "log_files": [4328237]}
> 2024-09-16T15:36:54.620+0530 7f5783d1e5c0  4 rocksdb:
> [db/db_impl_open.cc:583] Recovering log #4328237 mode 2
> 2024-09-16T15:36:54.620+0530 7f5783d1e5c0  4 rocksdb:
> [db/version_set.cc:3036] Creating manifest 4328239
>
> 2024-09-16T15:36:54.620+0530 7f5783d1e5c0  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1726481214625473, "job": 1, "event": "recovery_finished"}
> 2024-09-16T15:36:54.628+0530 7f5783d1e5c0  4 rocksdb: DB pointer
> 0x561bb7e90000
>
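For what it's worth, those rocksdb messages are normal recovery output rather than errors: the store replayed its write-ahead log and finished. The EVENT_LOG_v1 entries carry a JSON payload after the marker, which can be pulled out to check this at a glance. A minimal sketch (the log path in the usage comment is a placeholder, and extract_rocksdb_events is a helper written here, not a ceph tool):

```shell
# Pull the JSON payload out of rocksdb EVENT_LOG_v1 lines. A
# recovery_started / recovery_finished pair (as in the log above)
# means the WAL replay completed cleanly.
extract_rocksdb_events() {
    grep -o 'EVENT_LOG_v1 {.*}' "$1" | sed 's/^EVENT_LOG_v1 //'
}

# Usage, assuming the mon wrote its log to the usual place:
#   extract_rocksdb_events /var/log/ceph/ceph-mon.node.log
```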



On Mon, Sep 16, 2024 at 2:22 PM Frank Schilder <frans@xxxxxx> wrote:

> Hi. When I have issues like this, what sometimes helps is to start a
> daemon manually (not systemctl or anything like that). Make sure no
> ceph-mon is running on the host:
>
> ps -eo cmd | grep ceph-mon
>
> and start a ceph-mon manually with a command like this (make sure the
> binary is the correct version):
>
> /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph
> --foreground -i MON-NAME --mon-data /var/lib/ceph/mon/STORE --public-addr
> MON-IP
>
> Depending on your debug settings, this command outputs a bit of
> information on startup. If your settings in ceph.conf are 0/0, I think
> you can override them on the command line. It might be useful to set
> the option "-d" (debug mode with "log to stderr") on the command line
> as well. With defaults it will at least report opening the store and
> then just wait, or complain that there are no peers.
>
> This is a good sign.
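A sketch of the same manual start with debug levels raised on the command line. The MON name, store path, and IP below are placeholders to substitute; --debug_mon and --debug_ms override any 0/0 setting in ceph.conf:

```shell
# Placeholder values -- substitute your MON name, store path and IP.
MON_NAME=node1
MON_STORE=/var/lib/ceph/mon/ceph-node1
MON_IP=192.168.1.10

# Same manual start as above, but with debug levels raised so that
# 0/0 settings in ceph.conf are overridden; -d keeps the daemon in
# the foreground and sends the log to stderr.
CMD="/usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph \
 -d --debug_mon 10 --debug_ms 1 \
 -i $MON_NAME --mon-data $MON_STORE --public-addr $MON_IP"

echo "$CMD"   # review the command first, then run it with: eval "$CMD"
```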
>
> Once you have one MON running, start another one on another host, and
> so on until you have enough up for quorum. Then you can start querying
> the MONs about what their problem is.
>
> If none of this works, the output of the manual command, possibly with
> higher debug settings on the command line, should be helpful.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Amudhan P <amudhan83@xxxxxxxxx>
> Sent: Monday, September 16, 2024 10:36 AM
> To: Eugen Block
> Cc: ceph-users@xxxxxxx
> Subject:  Re: Ceph octopus version cluster not starting
>
> No, I don't use cephadm, and I have enough space for log storage.
>
> When I try to start the mon service on any of the nodes, it just keeps
> waiting to complete, without any error message in stdout or in the log
> file.
>
> On Mon, Sep 16, 2024 at 1:21 PM Eugen Block <eblock@xxxxxx> wrote:
>
> > Hi,
> >
> > I would focus on the MONs first. If they don't start, your cluster is
> > not usable. It doesn't look like you use cephadm, but please confirm.
> > Check if the nodes are running out of disk space, maybe that's why
> > they don't log anything and fail to start.
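A quick way to check the disk-space point; a full partition under /var/log or /var/lib/ceph can make daemons fail without writing anything. The paths below are the usual defaults and may differ on your install:

```shell
# Show free space where ceph daemons write logs and state; fall back
# to the root filesystem if those paths don't exist on this host.
df -h /var/log /var/lib/ceph 2>/dev/null || df -h /
```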
> >
> >
> > Quoting Amudhan P <amudhan83@xxxxxxxxx>:
> >
> > > Hi,
> > >
> > > Recently I added one disk to the Ceph cluster using "ceph-volume lvm
> > > create --data /dev/sdX", but the new OSD didn't start. After some
> > > time, the OSD services on the other nodes also stopped, so I
> > > restarted all nodes in the cluster.
> > > Now, after the restart, the MON, MDS, MGR and OSD services are not
> > > starting. I could not find any new logs after the restart; it is
> > > totally silent on all nodes.
> > > I could find some logs in the ceph-volume service.
> > >
> > >
> > > Error in Ceph-volume logs :-
> > > [2024-09-15 23:38:15,080][ceph_volume.process][INFO  ] stderr Running
> > > command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-5
> > > --> Executable selinuxenabled not in PATH:
> > > /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
> > > Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-5
> > > Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph
> > prime-osd-dir
> > > --dev
> > >
> >
> /dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e
> > > --path /var/lib/ceph/osd/ceph-5 --no-mon-config
> > >  stderr: failed to read label for
> > >
> >
> /dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e:
> > > (2) No such file or directory
> > > 2024-09-15T23:38:15.059+0530 7fe7767c8100 -1
> > >
> >
> bluestore(/dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e)
> > > _read_bdev_label failed to open
> > >
> >
> /dev/ceph-33cd42cd-8570-47de-8703-d7cab1acf2ae/osd-block-21968433-bb53-4415-b9e2-fdc36bc4a28e:
> > > (2) No such file or directory
> > > -->  RuntimeError: command returned non-zero exit status: 1
> > > [2024-09-15 23:38:15,084][ceph_volume.process][INFO  ] stderr Running
> > > command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-2
> > > --> Executable selinuxenabled not in PATH:
> > > /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
> > > Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
> > > Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph
> > prime-osd-dir
> > > --dev
> > >
> >
> /dev/ceph-9a9b8328-66ad-4997-8b9f-5216b56b73e8/osd-block-ac2ae41d-3b77-4bfd-ba5c-737e4266e988
> > > --path /var/lib/ceph/osd/ceph-2 --no-mon-config
> > >  stderr: failed to read label for
> > >
> >
> /dev/ceph-9a9b8328-66ad-4997-8b9f-5216b56b73e8/osd-block-ac2ae41d-3b77-4bfd-ba5c-737e4266e988:
> > > (2) No such file or directory
> > >
> > > But I could find the path
> > > /dev/ceph-9a9b8328-66ad-4997-8b9f-5216b56b73e8/osd-block-ac2ae41d-3b77-4bfd-ba5c-737e4266e988
> > > and it is valid; I can list the folder.
> > >
> > > Not sure how to proceed or where to start. Any idea or suggestion?
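One thing worth checking on the failed-label error above: listing the directory only proves the /dev/&lt;vg&gt;/&lt;lv&gt; symlink exists, while ceph-bluestore-tool needs the node it resolves to to be an activated block device, which can be missing after an unclean reboot. A small sketch (check_dev is a helper written here for illustration, not a ceph tool):

```shell
# Classify why ceph-bluestore-tool might fail to open an LV path.
check_dev() {
    if [ ! -e "$1" ]; then
        echo "missing"                        # matches the (2) No such file or directory error
    elif [ -b "$1" ]; then
        echo "block device"
    else
        echo "exists but not a block device"  # LV likely not activated
    fi
}

# Usage, with the osd-block path from the error message:
#   check_dev /dev/ceph-9a9b8328-66ad-4997-8b9f-5216b56b73e8/osd-block-ac2ae41d-3b77-4bfd-ba5c-737e4266e988
```

If it reports "missing" or "exists but not a block device", activating the volume groups with "vgchange -ay" and then retrying OSD activation may help.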
> > > _______________________________________________
> > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >



