Re: Ceph-Deploy error on 15/71 stage

Hi Eugen.

Sorry for the delay in answering.

I just looked in the /var/log/ceph/ directory. It only contains the following files (for example, on node01):

#######
# ls -lart
total 3864
-rw------- 1 ceph ceph     904 ago 24 13:11 ceph.audit.log-20180829.xz
drwxr-xr-x 1 root root     898 ago 28 10:07 ..
-rw-r--r-- 1 ceph ceph  189464 ago 28 23:59 ceph-mon.node01.log-20180829.xz
-rw------- 1 ceph ceph   24360 ago 28 23:59 ceph.log-20180829.xz
-rw-r--r-- 1 ceph ceph   48584 ago 29 00:00 ceph-mgr.node01.log-20180829.xz
-rw------- 1 ceph ceph       0 ago 29 00:00 ceph.audit.log
drwxrws--T 1 ceph ceph     352 ago 29 00:00 .
-rw-r--r-- 1 ceph ceph 1908122 ago 29 12:46 ceph-mon.node01.log
-rw------- 1 ceph ceph  175229 ago 29 12:48 ceph.log
-rw-r--r-- 1 ceph ceph 1599920 ago 29 12:49 ceph-mgr.node01.log
#######

So it only contains logs concerning the node itself (is that correct? Since node01 is also the master, I was expecting it to have logs from the other nodes too) and, moreover, there are no ceph-osd* files at all. I have also looked through the logs that are available, and nothing stands out (sorry for my poor English) as a possible error.
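
In case it helps, this is more or less how I scanned them, nothing more sophisticated than a grep (so I may well be missing something):

#######
# grep -iE "error|fail" /var/log/ceph/ceph.log /var/log/ceph/ceph-mon.node01.log /var/log/ceph/ceph-mgr.node01.log
#######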

Any suggestion on how to proceed?

Thanks a lot in advance,

Jones


On Mon, Aug 27, 2018 at 5:29 AM Eugen Block <eblock@xxxxxx> wrote:
Hi Jones,

All Ceph logs are in the directory /var/log/ceph/; each daemon has its
own log file, e.g. OSD logs are named ceph-osd.*.
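
If OSD logs do exist, a quick way to scan them is something like this
(just a generic pattern, adjust it as needed):

  grep -i -E "error|fail" /var/log/ceph/ceph-osd.*.log

If there are no ceph-osd.* files at all, that in itself is a hint that
the OSD daemons were never created on that node.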

I haven't tried it, but I don't think SUSE Enterprise Storage deploys
OSDs on partitioned disks. Is there a way to attach a second disk to
the OSD nodes, maybe via USB or something?

Although this thread is Ceph related, it refers to a specific
product, so I would recommend posting your question in the SUSE forum
[1].

Regards,
Eugen

[1] https://forums.suse.com/forumdisplay.php?99-SUSE-Enterprise-Storage

Quoting Jones de Andrade <johannesrs@xxxxxxxxx>:

> Hi Eugen.
>
> Thanks for the suggestion. I'll look for the logs (since it's our first
> attempt with ceph, I'll have to discover where they are, but no problem).
>
> One thing called my attention on your response however:
>
> I may not have made myself clear, but one of the failures we encountered
> was that the files that now contain the following:
>
> node02:
>    ----------
>    storage:
>        ----------
>        osds:
>            ----------
>            /dev/sda4:
>                ----------
>                format:
>                    bluestore
>                standalone:
>                    True
>
> were originally empty, and we filled them in by hand following a model found
> elsewhere on the web. That was necessary so that we could continue, but the
> model indicated that, for example, it should have the path /dev/sda here,
> not /dev/sda4. We chose to include the specific partition because we won't
> have dedicated disks here, only that same partition, as all the disks were
> partitioned exactly the same.
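>
> For reference, the file we wrote for each node looks roughly like this (I'm
> reproducing it from memory, so take it only as an approximation of what the
> pillar output above comes from):
>
> ceph:
>   storage:
>     osds:
>       /dev/sda4:
>         format: bluestore
>         standalone: True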
>
> While that was enough for the procedure to continue at that point, I now
> wonder whether it was the right call and, if it indeed was, whether it was
> done properly. So I wonder: what do you mean by "wipe" the partition here?
> /dev/sda4 is created, but it is both empty and unmounted. Should a different
> operation be performed on it, should I remove it first, or should I have
> written the files above with only /dev/sda as the target?
>
> I know I probably wouldn't run into these issues with dedicated disks,
> but unfortunately that is absolutely not an option.
>
> Thanks a lot in advance for any comments and/or extra suggestions.
>
> Sincerely yours,
>
> Jones
>
> On Sat, Aug 25, 2018 at 5:46 PM Eugen Block <eblock@xxxxxx> wrote:
>
>> Hi,
>>
>> Take a look at the logs; they should point you in the right direction.
>> Since the deployment stage fails at the OSD level, start with the OSD
>> logs. Something's not right with the disks/partitions; did you wipe
>> the partition from previous attempts?
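>>
>> If not, something along these lines usually does it (careful, it wipes
>> everything on that partition):
>>
>>   wipefs --all /dev/sda4
>>   dd if=/dev/zero of=/dev/sda4 bs=1M count=100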
>>
>> Regards,
>> Eugen
>>
>> Quoting Jones de Andrade <johannesrs@xxxxxxxxx>:
>>
>>> (Please forgive my previous email: I was reusing another message and
>>> completely forgot to update the subject.)
>>>
>>> Hi all.
>>>
>>> I'm new to ceph, and after having serious problems in ceph stages 0, 1 and
>>> 2 that I could solve myself, now it seems that I have hit a wall harder
>>> than my head. :)
>>>
>>> When I run salt-run state.orch ceph.stage.deploy and monitor it, I see it
>>> going up to here:
>>>
>>> #######
>>> [14/71]   ceph.sysctl on
>>>           node01....................................... ✓ (0.5s)
>>>           node02........................................ ✓ (0.7s)
>>>           node03....................................... ✓ (0.6s)
>>>           node04......................................... ✓ (0.5s)
>>>           node05....................................... ✓ (0.6s)
>>>           node06.......................................... ✓ (0.5s)
>>>
>>> [15/71]   ceph.osd on
>>>           node01...................................... ❌ (0.7s)
>>>           node02........................................ ❌ (0.7s)
>>>           node03....................................... ❌ (0.7s)
>>>           node04......................................... ❌ (0.6s)
>>>           node05....................................... ❌ (0.6s)
>>>           node06.......................................... ❌ (0.7s)
>>>
>>> Ended stage: ceph.stage.deploy succeeded=14/71 failed=1/71 time=624.7s
>>>
>>> Failures summary:
>>>
>>> ceph.osd (/srv/salt/ceph/osd):
>>>   node02:
>>>     deploy OSDs: Module function osd.deploy threw an exception.
>>>     Exception: Mine on node02 for cephdisks.list
>>>   node03:
>>>     deploy OSDs: Module function osd.deploy threw an exception.
>>>     Exception: Mine on node03 for cephdisks.list
>>>   node01:
>>>     deploy OSDs: Module function osd.deploy threw an exception.
>>>     Exception: Mine on node01 for cephdisks.list
>>>   node04:
>>>     deploy OSDs: Module function osd.deploy threw an exception.
>>>     Exception: Mine on node04 for cephdisks.list
>>>   node05:
>>>     deploy OSDs: Module function osd.deploy threw an exception.
>>>     Exception: Mine on node05 for cephdisks.list
>>>   node06:
>>>     deploy OSDs: Module function osd.deploy threw an exception.
>>>     Exception: Mine on node06 for cephdisks.list
>>> #######
>>>
>>> Since this is a first attempt on 6 simple test machines, we are going to
>>> put the mon, osd, etc. roles on all nodes at first. Only the master role is
>>> kept on a single machine (node01) for now.
>>>
>>> As they are simple machines, each has a single HDD, which is partitioned
>>> as follows (the sda4 partition is unmounted and left for the ceph system):
>>>
>>> ###########
>>> # lsblk
>>> NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
>>> sda      8:0    0 465,8G  0 disk
>>> ├─sda1   8:1    0   500M  0 part /boot/efi
>>> ├─sda2   8:2    0    16G  0 part [SWAP]
>>> ├─sda3   8:3    0  49,3G  0 part /
>>> └─sda4   8:4    0   400G  0 part
>>> sr0     11:0    1   3,7G  0 rom
>>>
>>> # salt -I 'roles:storage' cephdisks.list
>>> node01:
>>> node02:
>>> node03:
>>> node04:
>>> node05:
>>> node06:
>>>
>>> # salt -I 'roles:storage' pillar.get ceph
>>> node02:
>>>     ----------
>>>     storage:
>>>         ----------
>>>         osds:
>>>             ----------
>>>             /dev/sda4:
>>>                 ----------
>>>                 format:
>>>                     bluestore
>>>                 standalone:
>>>                     True
>>> (and so on for all 6 machines)
>>> ##########
>>>
>>> Finally and just in case, my policy.cfg file reads:
>>>
>>> #########
>>> #cluster-unassigned/cluster/*.sls
>>> cluster-ceph/cluster/*.sls
>>> profile-default/cluster/*.sls
>>> profile-default/stack/default/ceph/minions/*yml
>>> config/stack/default/global.yml
>>> config/stack/default/ceph/cluster.yml
>>> role-master/cluster/node01.sls
>>> role-admin/cluster/*.sls
>>> role-mon/cluster/*.sls
>>> role-mgr/cluster/*.sls
>>> role-mds/cluster/*.sls
>>> role-ganesha/cluster/*.sls
>>> role-client-nfs/cluster/*.sls
>>> role-client-cephfs/cluster/*.sls
>>> ##########
>>>
>>> Please, could someone help me and shed some light on this issue?
>>>
>>> Thanks a lot in advance,
>>>
>>> Regards,
>>>
>>> Jones
>>


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
