Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD

Patrick,

Sorry for the delayed response.  This seems to be the limit of the assistance I’m capable of providing.  My deployments are all Ubuntu and were bootstrapped (or upgraded) according to this starting doc:
https://docs.ceph.com/en/quincy/cephadm/install/#cephadm-deploying-new-cluster

It is very confusing to me that cephadm and ceph-volume are able to zap the device, but cephadm ceph-volume inventory shows nothing.  It’s even more perplexing to me, because on my systems even the OS disks are listed as not available.

Maybe someone else here has an idea what’s going on.

One last difference that might be a place for you to investigate.  I’m using docker, so perhaps your podman installation is somehow limiting direct access to the disk devices?
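One way to test the podman hypothesis is to compare the block devices the host sees with those visible inside the cephadm container. The sketch below is only an illustration: the helper name `missing_in_container` and the file paths are hypothetical, and it assumes `lsblk` is available both on the host and in the container image.

```shell
# Print device names listed in the host file ($1) but absent from the
# container file ($2). Both files hold one device name per line.
# missing_in_container is a hypothetical helper, not part of cephadm.
missing_in_container() {
  # -v invert match, -x whole-line match, -F fixed strings, -f patterns from file
  grep -vxFf "$2" "$1" || true
}

# Intended use on a cephadm host (not executed here):
#   lsblk -dn -o NAME > /tmp/host_devs.txt
#   cephadm shell -- lsblk -dn -o NAME > /tmp/container_devs.txt
#   missing_in_container /tmp/host_devs.txt /tmp/container_devs.txt
```

An empty result would suggest the container sees every disk the host does, in which case the container runtime is probably not what hides the devices from the inventory.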

Best of luck,
Josh Beaman

From: Patrick Begou <Patrick.Begou@xxxxxxxxxxxxxxxxxxxxxx>
Date: Saturday, May 13, 2023 at 3:33 AM
To: Beaman, Joshua <Joshua_Beaman@xxxxxxxxxxx>, ceph-users <ceph-users@xxxxxxx>
Subject: Re: [EXTERNAL]  [Pacific] ceph orch device ls do not returns any HDD
Hi Joshua,

I've tried these commands but it looks like CEPH is unable to see and configure these HDDs.
[root@mostha1 ~]# cephadm ceph-volume inventory
Inferring fsid 4b7a6504-f0be-11ed-be1a-00266cf8869c
Using recent ceph image quay.io/ceph/ceph@sha256:e6919776f0ff8331a8e9c4b18d36c5e9eed31e1a80da62ae8454e42d10e95544

Device Path               Size         Device nodes    rotates available Model name
[root@mostha1 ~]# cephadm shell
[ceph: root@mostha1 /]# ceph orch apply osd --all-available-devices
Scheduled osd.all-available-devices update...
[ceph: root@mostha1 /]# ceph orch device ls
[ceph: root@mostha1 /]# ceph-volume lvm zap /dev/sdb
--> Zapping: /dev/sdb
--> --destroy was not specified, but zapping a whole device will remove the partition table
Running command: /usr/bin/dd if=/dev/zero of=/dev/sdb bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.10039 s, 104 MB/s
--> Zapping successful for: <Raw Device: /dev/sdb>
I can confirm that /dev/sdb1 has been erased, so the previous command was successful:
[ceph: root@mostha1 ceph]# lsblk
NAME                 MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                    8:0    1 232.9G  0 disk
|-sda1                 8:1    1   3.9G  0 part /rootfs/boot
|-sda2                 8:2    1  78.1G  0 part
| `-osvg-rootvol     253:0    0  48.8G  0 lvm  /rootfs
|-sda3                 8:3    1   3.9G  0 part [SWAP]
`-sda4                 8:4    1 146.9G  0 part
  |-secretvg-homevol 253:1    0   9.8G  0 lvm  /rootfs/home
  |-secretvg-tmpvol  253:2    0   9.8G  0 lvm  /rootfs/tmp
  `-secretvg-varvol  253:3    0   9.8G  0 lvm  /rootfs/var
sdb                    8:16   1 465.8G  0 disk
sdc                    8:32   1 232.9G  0 disk

But still no visible HDD:

[ceph: root@mostha1 ceph]# ceph orch apply osd --all-available-devices
Scheduled osd.all-available-devices update...
[ceph: root@mostha1 ceph]# ceph orch device ls
[ceph: root@mostha1 ceph]#

Maybe I did something wrong at install time, because inside the container I unintentionally ran:

dnf -y install https://download.ceph.com/rpm-16.2.13/el8/noarch/cephadm-16.2.13-0.el8.noarch.rpm

(an unfortunate copy/paste that launched the command). Can this break the container? I do not know which ceph packages are supposed to be present in the container, so I cannot cleanly undo this install (there is no dnf.log file in the container).

Patrick


On 12/05/2023 at 21:38, Beaman, Joshua wrote:

The most significant point I see there, is you have no OSD service spec to tell orchestrator how to deploy OSDs.  The easiest fix for that would be “ceph orch apply osd --all-available-devices”
This will create a simple spec that should work for a test environment.  Most likely it will collocate the block, block.db, and WAL all on the same device.  Not ideal for prod environments, but fine for practice and testing.
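For reference, an explicit spec roughly equivalent to what `--all-available-devices` generates could be written out and previewed as sketched below. This is a sketch under assumptions: the helper name `write_osd_spec`, the `service_id` and the file path are hypothetical, and the nested `spec:` layout is assumed to match what `ceph orch ls osd --export` prints on recent releases (older builds may expect `data_devices` at the top level).

```shell
# Write a minimal OSD service spec to the file named in $1.
# write_osd_spec and the service_id below are hypothetical names.
write_osd_spec() {
  cat > "$1" <<'EOF'
service_type: osd
service_id: test_all_devices
placement:
  host_pattern: '*'
spec:
  data_devices:
    all: true
EOF
}

# Intended use inside "cephadm shell" (not executed here):
#   write_osd_spec /tmp/osd_spec.yml
#   ceph orch apply -i /tmp/osd_spec.yml --dry-run
```

The `--dry-run` flag asks the orchestrator to report which OSDs it would create without deploying them, which is a low-risk way to check whether it considers any device usable.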

The other command I should have had you try is “cephadm ceph-volume inventory”.  That should show you the devices available for OSD deployment, and hopefully matches up to what your “lsblk” shows.  If you need to zap HDDs and orchestrator is still not seeing them, you can try “cephadm ceph-volume lvm zap /dev/sdb”
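To compare the inventory with `lsblk` more mechanically, a small filter like the one below could list whole disks that carry no partitions or logical volumes. It is only a rough approximation: the helper name `unused_disks` is hypothetical, and ceph-volume applies more criteria than this when deciding whether a device is available.

```shell
# Read "NAME TYPE PKNAME" rows on stdin (the layout produced by
# `lsblk -rn -o NAME,TYPE,PKNAME`) and print whole disks that no other
# row names as its parent, i.e. disks without partitions or LVs.
# unused_disks is a hypothetical helper name.
unused_disks() {
  awk '
    $2 == "disk" { disks[++n] = $1 }   # remember disks in input order
    NF >= 3      { parent[$3] = 1 }    # third column is the parent device
    END {
      for (i = 1; i <= n; i++)
        if (!(disks[i] in parent)) print disks[i]
    }
  '
}

# Intended use (not executed here):
#   lsblk -rn -o NAME,TYPE,PKNAME | unused_disks
```

Any disk printed here but missing from `cephadm ceph-volume inventory` would be worth mentioning when asking for help.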

Thank you,
Josh Beaman

From: Patrick Begou <Patrick.Begou@xxxxxxxxxxxxxxxxxxxxxx>
Date: Friday, May 12, 2023 at 2:22 PM
To: Beaman, Joshua <Joshua_Beaman@xxxxxxxxxxx>, ceph-users <ceph-users@xxxxxxx>
Subject: Re: [EXTERNAL]  [Pacific] ceph orch device ls do not returns any HDD
Hi Joshua and thanks for this quick reply.

At this stage I have only one node. I was checking what Ceph returns for various commands on this host before adding new hosts, to compare with my first Octopus install. As this hardware is for testing only, it is easy for me to break everything and reinstall.

[root@mostha1 ~]# cephadm check-host
podman (/usr/bin/podman) version 4.2.0 is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
[ceph: root@mostha1 /]# ceph -s
  cluster:
    id:     4b7a6504-f0be-11ed-be1a-00266cf8869c
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum mostha1.legi.grenoble-inp.fr (age 5h)
    mgr: mostha1.legi.grenoble-inp.fr.hogwuz(active, since 5h)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
[ceph: root@mostha1 /]# ceph orch ls
NAME           PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager   ?:9093,9094      1/1  6m ago     6h   count:1
crash                           1/1  6m ago     6h   *
grafana        ?:3000           1/1  6m ago     6h   count:1
mgr                             1/2  6m ago     6h   count:2
mon                             1/5  6m ago     6h   count:5
node-exporter  ?:9100           1/1  6m ago     6h   *
prometheus     ?:9095           1/1  6m ago     6h   count:1
[ceph: root@mostha1 /]# ceph orch ls osd -export
No services reported
[ceph: root@mostha1 /]# ceph orch host ls
HOST                          ADDR           LABELS  STATUS
mostha1.legi.grenoble-inp.fr  194.254.66.34  _admin
1 hosts in cluster
[ceph: root@mostha1 /]# ceph log last cephadm
...
2023-05-12T15:19:58.754655+0000 mgr.mostha1.legi.grenoble-inp.fr.hogwuz (mgr.44098) 1876 : cephadm [INF] Zap device mostha1.legi.grenoble-inp.fr:/dev/sdb
2023-05-12T15:19:58.756639+0000 mgr.mostha1.legi.grenoble-inp.fr.hogwuz (mgr.44098) 1877 : cephadm [ERR] Device path '/dev/sdb' not found on host 'mostha1.legi.grenoble-inp.fr'
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
    return OrchResult(f(*args, **kwargs))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2275, in zap_device
    f"Device path '{path}' not found on host '{host}'")
orchestrator._interface.OrchestratorError: Device path '/dev/sdb' not found on host 'mostha1.legi.grenoble-inp.fr'
....
[ceph: root@mostha1 /]# ls -l /dev/sdb
brw-rw---- 1 root disk 8, 16 May 12 15:16 /dev/sdb
[ceph: root@mostha1 /]# lsblk /dev/sdb
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdb      8:16   1 465.8G  0 disk
`-sdb1   8:17   1 465.8G  0 part
I have created a single partition spanning all of /dev/sdb (for testing), and /dev/sdc has no partition table (I removed it).

But everything seems fine with these commands.

Patrick

On 12/05/2023 at 20:19, Beaman, Joshua wrote:
I don’t quite understand why that zap would not work.  But, here’s where I’d start.


  1.  cephadm check-host

     *   Run this on each of your hosts to make sure cephadm, podman and all other prerequisites are installed and recognized

  2.  ceph orch ls

     *   This should show at least a mon, mgr, and osd spec deployed

  3.  ceph orch ls osd --export

     *   This will show the OSD placement service specifications that orchestrator uses to identify devices to deploy as OSDs

  4.  ceph orch host ls

     *   This will list the hosts that have been added to orchestrator’s inventory, and what labels are applied which correlate to the service placement labels

  5.  ceph log last cephadm

     *   This will show you what orchestrator has been trying to do, and how it may be failing

Also, it’s never un-helpful to have a look at “ceph -s” and “ceph health detail”, particularly for any people trying to help you without access to your systems.
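The checks listed above could be gathered into one report to paste into a mailing-list post. The following is a minimal sketch, assuming it runs where both `cephadm` and the `ceph` CLI are reachable (for instance with the ceph commands run inside `cephadm shell`); the function name `collect_diagnostics` is hypothetical.

```shell
# Run each diagnostic command in turn and print its output under a header.
# collect_diagnostics is a hypothetical helper; commands whose binary is
# missing are reported rather than aborting the whole run.
collect_diagnostics() {
  for cmd in \
    "cephadm check-host" \
    "ceph orch ls" \
    "ceph orch ls osd --export" \
    "ceph orch host ls" \
    "ceph log last cephadm" \
    "ceph -s" \
    "ceph health detail"
  do
    printf '==== %s ====\n' "$cmd"
    bin=${cmd%% *}   # first word of the command line
    if command -v "$bin" >/dev/null 2>&1; then
      # $cmd is left unquoted on purpose so it splits into words
      $cmd 2>&1 || printf '(command exited with status %s)\n' "$?"
    else
      printf '(%s not found in PATH)\n' "$bin"
    fi
  done
}

# Intended use (not executed here):
#   collect_diagnostics > ceph-diag.txt
```

Keeping the headers makes it obvious which command produced which output, including the cases where a command prints nothing at all.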

Best of luck,
Josh Beaman

From: Patrick Begou <Patrick.Begou@xxxxxxxxxxxxxxxxxxxxxx>
Date: Friday, May 12, 2023 at 10:45 AM
To: ceph-users <ceph-users@xxxxxxx>
Subject: [EXTERNAL]  [Pacific] ceph orch device ls do not returns any HDD
Hi everyone

I'm new to Ceph: I have only followed a four-day training session (in
French) with Octopus on VMs, which convinced me to build my first cluster.

At the moment I have 4 old, identical nodes for testing, each with 3 HDDs
and 2 network interfaces, running AlmaLinux 8 (el8). I tried to replay the
training session, but it failed and broke the web interface because of
some incompatibilities between podman 4.2 and Octopus.

So I am trying to deploy Pacific with the cephadm tool on my first node
(mostha1), which will also let me test an upgrade later.

    dnf -y install https://download.ceph.com/rpm-16.2.13/el8/noarch/cephadm-16.2.13-0.el8.noarch.rpm

    monip=$(getent ahostsv4 mostha1 |head -n 1| awk '{ print $1 }')
    cephadm bootstrap --mon-ip $monip --initial-dashboard-password xxxxx \
                       --initial-dashboard-user admceph \
                       --allow-fqdn-hostname --cluster-network 10.1.0.0/16

This was successful.

But running "ceph orch device ls" does not show any HDD, even though I have
/dev/sda (used by the OS), /dev/sdb and /dev/sdc.

The web interface shows a raw capacity which is the aggregate of the
sizes of the 3 HDDs of the node.

I've also tried to reset /dev/sdb, but cephadm does not see it:

    [ceph: root@mostha1 /]# ceph orch device zap mostha1.legi.grenoble-inp.fr /dev/sdb --force
    Error EINVAL: Device path '/dev/sdb' not found on host 'mostha1.legi.grenoble-inp.fr'

On my first attempt with Octopus, I was able to list the available HDDs
with this command. Before moving to Pacific, the OS on this node
was reinstalled from scratch.

Any advice for a Ceph beginner?

Thanks

Patrick
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx