Re: Ceph-Deploy error on 15/71 stage

Hi Jones,

all Ceph logs are in the directory /var/log/ceph/; each daemon has its own log file, e.g. the OSD logs are named ceph-osd.*.
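For a first look, something like this should surface recent errors (just a generic example; the exact file names depend on which daemons exist on that node):

# ls /var/log/ceph/
# grep -i error /var/log/ceph/ceph-osd.*.log | tail -n 50

If the OSDs were never created there may be no ceph-osd.* files yet; in that case the salt minion log (/var/log/salt/minion by default) is also worth checking.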

I haven't tried it myself, but I don't think SUSE Enterprise Storage deploys OSDs on partitioned disks. Is there a way to attach a second disk to the OSD nodes, maybe via USB or something?

Although this thread is Ceph related, it refers to a specific product, so I would recommend posting your question in the SUSE forum [1].

Regards,
Eugen

[1] https://forums.suse.com/forumdisplay.php?99-SUSE-Enterprise-Storage

Quoting Jones de Andrade <johannesrs@xxxxxxxxx>:

Hi Eugen.

Thanks for the suggestion. I'll look for the logs (since it's our first
attempt with ceph, I'll have to discover where they are, but no problem).

One thing in your response caught my attention, however:

I hadn't made this clear before, but one of the failures we encountered was that the files now containing:

node02:
   ----------
   storage:
       ----------
       osds:
           ----------
           /dev/sda4:
               ----------
               format:
                   bluestore
               standalone:
                   True

were originally empty, and we filled them in by hand following a model found elsewhere on the web. That was necessary so that we could continue, but the model indicated that, for example, the path should be /dev/sda here, not /dev/sda4. We chose to include the specific partition because we won't have dedicated disks here, only that one spare partition; all disks were partitioned exactly the same.
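To make clear what I mean, if we had followed that model literally I believe the minion yml (under profile-default/stack/default/ceph/minions/, if I read our policy.cfg right) would have pointed at the whole disk instead, roughly like this:

ceph:
  storage:
    osds:
      /dev/sda:
        format: bluestore
        standalone: True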

While that was enough for the procedure to continue at that point, I now wonder whether it was the right call and, if it was, whether it was done properly. So: what do you mean by "wipe" the partition here? /dev/sda4 has been created, but it is both empty and unmounted. Should a different operation be performed on it, should I remove it first, or should I have written the files above with only /dev/sda as the target?

I know I probably wouldn't run into these issues with dedicated disks, but unfortunately that is absolutely not an option.

Thanks a lot in advance for any comments and/or extra suggestions.

Sincerely yours,

Jones

On Sat, Aug 25, 2018 at 5:46 PM Eugen Block <eblock@xxxxxx> wrote:

Hi,

take a look at the logs, they should point you in the right direction.
Since the deployment stage fails at the OSD level, start with the OSD
logs. Something's not right with the disks/partitions; did you wipe
the partition from previous attempts?
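Something along these lines usually does it (just a generic example, double-check the device name before running anything destructive):

# wipefs -a /dev/sda4

or, to be more thorough, zero the beginning of the partition:

# dd if=/dev/zero of=/dev/sda4 bs=1M count=100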

Regards,
Eugen

Quoting Jones de Andrade <johannesrs@xxxxxxxxx>:

(Please forgive my previous email: I was reusing another message and completely forgot to update the subject.)

Hi all.

I'm new to Ceph, and after serious problems in stages 0, 1 and 2 that I could solve myself, it now seems that I have hit a wall harder than my head. :)

When I run salt-run state.orch ceph.stage.deploy and monitor it, I see it getting up to this point:

#######
[14/71]   ceph.sysctl on
          node01....................................... ✓ (0.5s)
          node02........................................ ✓ (0.7s)
          node03....................................... ✓ (0.6s)
          node04......................................... ✓ (0.5s)
          node05....................................... ✓ (0.6s)
          node06.......................................... ✓ (0.5s)

[15/71]   ceph.osd on
          node01...................................... ❌ (0.7s)
          node02........................................ ❌ (0.7s)
          node03....................................... ❌ (0.7s)
          node04......................................... ❌ (0.6s)
          node05....................................... ❌ (0.6s)
          node06.......................................... ❌ (0.7s)

Ended stage: ceph.stage.deploy succeeded=14/71 failed=1/71 time=624.7s

Failures summary:

ceph.osd (/srv/salt/ceph/osd):
  node02:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node02 for cephdisks.list
  node03:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node03 for cephdisks.list
  node01:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node01 for cephdisks.list
  node04:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node04 for cephdisks.list
  node05:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node05 for cephdisks.list
  node06:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node06 for cephdisks.list
#######

Since this is a first attempt on 6 simple test machines, we are putting the mon, osds, etc. on all nodes at first. Only the master role is kept on a single machine (node01) for now.

As they are simple machines, each has a single HDD, which is partitioned as follows (the sda4 partition is unmounted and left for the Ceph system):

###########
# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 465,8G  0 disk
├─sda1   8:1    0   500M  0 part /boot/efi
├─sda2   8:2    0    16G  0 part [SWAP]
├─sda3   8:3    0  49,3G  0 part /
└─sda4   8:4    0   400G  0 part
sr0     11:0    1   3,7G  0 rom

# salt -I 'roles:storage' cephdisks.list
node01:
node02:
node03:
node04:
node05:
node06:

# salt -I 'roles:storage' pillar.get ceph
node02:
    ----------
    storage:
        ----------
        osds:
            ----------
            /dev/sda4:
                ----------
                format:
                    bluestore
                standalone:
                    True
(and so on for all 6 machines)
##########

Finally, and just in case, my policy.cfg file reads:

#########
#cluster-unassigned/cluster/*.sls
cluster-ceph/cluster/*.sls
profile-default/cluster/*.sls
profile-default/stack/default/ceph/minions/*yml
config/stack/default/global.yml
config/stack/default/ceph/cluster.yml
role-master/cluster/node01.sls
role-admin/cluster/*.sls
role-mon/cluster/*.sls
role-mgr/cluster/*.sls
role-mds/cluster/*.sls
role-ganesha/cluster/*.sls
role-client-nfs/cluster/*.sls
role-client-cephfs/cluster/*.sls
##########

Please, could someone help me and shed some light on this issue?

Thanks a lot in advance,

Regards,

Jones



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com






