Re: Newbie Requesting Help - Please, This Is Driving Me Mad/Crazy!

I'm not running Octopus and I don't use the hard-core bare-metal deployment method; I use ceph-volume and things work smoothly. Hence, my input might be useless.

Now, looking at your text: you should always include the start-up and shut-down logs of the OSD. As a wild guess, did you copy the OSD auth key to the required directory? It's somewhere in the instructions, and I can't seem to find the copy command in your description.
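
If that's it, something along these lines (assuming the default data path for osd.0) fetches the key and shows the recent start-up log:

        ceph auth get osd.0 -o /var/lib/ceph/osd/ceph-0/keyring
        chown ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
        journalctl -u ceph-osd@0 -n 50 --no-pager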

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: matthew@xxxxxxxxxxxxxxx <matthew@xxxxxxxxxxxxxxx>
Sent: 23 February 2021 06:09:52
To: ceph-users@xxxxxxx
Subject:  Newbie Requesting Help - Please, This Is Driving Me Mad/Crazy!

Hi Everyone,

Let me apologise upfront:

    If this isn't the correct List to post to
    If this has been answered already (& I've missed it in my searching)
    If this has ended up double posted
    If I've in any way given (or about to give) offence to anyone

I really need some help.

I'm trying to get a simple single-host pilot/test cluster up and running. I'm using CentOS 8 (fully updated) and Ceph Octopus (the latest version from the Ceph repo). I have both ceph-mon and ceph-mgr working/running (although ceph-mgr keeps stopping/crashing after about 1-3 hours - but that's another issue), and my first (and, at this point, only) OSD *appears* to have been created. However, when I issue 'systemctl start ceph-osd@0' the ceph-osd daemon won't spin up, and 'ceph -s' then reports 'osd: 1 osds: 0 up, 0 in'.

I've gone through the relevant logs but I can't seem to find the issue.
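
For reference, the logs in question are the systemd journal for the OSD unit and the per-daemon log file set in my ceph.conf below:

        journalctl -xeu ceph-osd@0
        less /var/log/ceph/ceph-osd.0.log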

I'm doing this as a manual install because I want to actually *learn* what's going on under the hood. I know I could use cephadm (in a production environment), but as I said, I'm trying to learn how everything "fits together".

I've read and re-read the official Ceph documentation, and here are the steps/commands I used to get Ceph installed and running:

Ran the following commands:
        su -
        useradd -d /home/ceph -m ceph -p <password>
        mkdir /home/ceph/.ssh

Added a public SSH Key to /home/ceph/.ssh/authorized_keys.

Ran the following commands:
        chmod 600 /home/ceph/.ssh/*
        chown ceph:ceph -R /home/ceph/.ssh

Added the ceph.repo details to /etc/yum.repos.d/ceph.repo (as per the Ceph Documentation).

Ran the following command:
        dnf -y install qemu-kvm qemu-guest-agent libvirt gdisk ceph

Created the /etc/ceph/ceph.conf file (see listing below).

Ran the following commands:
        ceph-authtool --create-keyring /etc/ceph/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
        ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
        ceph-authtool --create-keyring /var/lib/ceph/bootstrap-osd/keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'
        ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
        ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/keyring
        chown -R ceph:ceph /etc/ceph/
        chown -R ceph:ceph /var/lib/ceph/
        monmaptool --create --add ceph01 192.168.0.10 --fsid 98e84f97-031f-4958-bd54-22305f6bc738 /etc/ceph/monmap
        mkdir /var/lib/ceph/mon/ceph-ceph01
        chown -R ceph:ceph /var/lib/ceph
        sudo -u ceph ceph-mon --mkfs -i ceph01 --monmap /etc/ceph/monmap --keyring /etc/ceph/ceph.mon.keyring
        firewall-cmd --add-service=http --permanent
        firewall-cmd --add-service=ceph --permanent
        firewall-cmd --add-service=ceph-mon --permanent
        firewall-cmd --reload
        chmod -R 750 /var/lib/ceph/
        systemctl start ceph-mon@ceph01
        ceph mon enable-msgr2
        mkdir /var/lib/ceph/mgr/ceph-ceph01
        chown ceph:ceph /var/lib/ceph/mgr/ceph-ceph01
        ceph auth get-or-create mgr.ceph01 mon 'allow profile mgr' mds 'allow *' osd 'allow *' -o /var/lib/ceph/mgr/ceph-ceph01/keyring
        ceph-mgr -i ceph01
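
To confirm the mon and mgr actually came up before moving on, the standard status commands are:

        ceph -s
        ceph mon stat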

Partitioned 3 HDDs (sdb, sdc, sdd) with fdisk as GPT, one partition each.

Ran the following commands:
        mkfs.xfs /dev/sdb1
        mkfs.xfs /dev/sdc1
        mkfs.xfs /dev/sdd1
        mkdir -p /var/lib/ceph/osd/ceph-{0,1,2}
        chown -R ceph:ceph /var/lib/ceph/osd
        mount /dev/sdb1 /var/lib/ceph/osd/ceph-0
        mount /dev/sdc1 /var/lib/ceph/osd/ceph-1
        mount /dev/sdd1 /var/lib/ceph/osd/ceph-2
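
(As an aside: the documented one-step alternative for OSD data-path preparation is ceph-volume, as below - but I'm deliberately doing things by hand here to learn the manual steps.)

        ceph-volume lvm create --data /dev/sdb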

So, at this point everything is working, although 'ceph -s' does give a health warning about not having the required number of OSDs (as per the /etc/ceph/ceph.conf file).

Here is what I did to create and (fail to) run my first osd (osd.0):

Ran the following commands:
        sudo -u ceph ceph osd new $(uuidgen)
        sudo -u ceph ceph auth get-or-create osd.0 osd 'allow *' mon 'allow profile osd' mgr 'allow profile osd' -o /var/lib/ceph/osd/ceph-0/keyring
        sudo -u ceph ceph-osd -i 0 --mkfs
        ceph osd crush add 0 2 host=ceph01
        systemctl start ceph-osd@0
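
(For reference, the long form in the official docs reuses the UUID from 'ceph osd new' when running '--mkfs'; I don't know whether skipping that matters, but roughly:)

        UUID=$(uuidgen)
        ID=$(ceph osd new $UUID)
        sudo -u ceph ceph-osd -i $ID --mkfs --osd-uuid $UUID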

The osd shows up when I issue the command 'ceph osd ls'.

The key shows up when I issue the command 'ceph auth ls'.

But as I said above, when I issue the command 'ceph -s' it shows 'osd: 1 osds: 0 up, 0 in'.

And when I look at the systemctl status for ceph-osd@0, it simply says the service failed with 'exit code'.
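
(In case it helps anyone diagnose this: running the daemon in the foreground with the -d flag logs to stderr, which should surface the actual error:)

        sudo -u ceph ceph-osd -d -i 0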

The /etc/ceph/ceph.conf listing:

[global]
        auth_client_required = cephx
        auth_cluster_required = cephx
        auth_service_required = cephx
        fsid = 98e84f97-031f-4958-bd54-22305f6bc738
        mon_host = ceph01
        public_network = 192.168.0.0/24

[mgr]
        mgr_initial_modules = dashboard alerts balancer restful status

[mgr.ceph01]
        log_file = /var/log/ceph/ceph-mgr.ceph01.log

[mon]
        mon_initial_members = ceph01
        mon_data_size_warn = 8589934592
        mon_allow_pool_delete = true

[mon.ceph01]
        host = ceph01
        mon_addr = 192.168.0.10
        log_file = /var/log/ceph/ceph-mon.ceph01.log

[osd]
        allow_ec_overwrites = true
        osd_crush_chooseleaf_type = 1
        osd_journal_size = 10240
        osd_pool_default_min_size = 2
        osd_pool_default_pg_num = 128
        osd_pool_default_pgp_num = 128
        osd_pool_default_size = 3
        osd_scrub_auto_repair = true
        osd_scrub_begin_hour = 3
        osd_scrub_end_hour = 11
        pg_autoscale_mode  = on

[osd.0]
        host = ceph01
        log_file = /var/log/ceph/ceph-osd.0.log

[osd.1]
        host = ceph01
        log_file = /var/log/ceph/ceph-osd.1.log

[osd.2]
        host = ceph01
        log_file = /var/log/ceph/ceph-osd.2.log


So, could someone please point out where I'm going wrong? I know it's got to be something super-simple, but this has been driving me mad for over a week now.

Thanks in advance

Dulux-Oz
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



