Re: Noob install: "rbd pool init" stuck

Hi,

you mentioned three servers, and you'll need all of them before your CRUSH rule can be satisfied: the default pool size is 3, and each PG has to be placed on three different hosts, which you currently don't have. There are a couple of ways to let the pool creation finish, but I recommend adding the other OSD servers first. It would be pretty bad if Ceph used the OS disk, wouldn't it? ;-) You can see which devices are available from Ceph's point of view with:

cephadm ceph-volume inventory
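
If you really can't wait for the additional hosts, one workaround (only sensible for testing, since all three replicas would then sit on a single host) is to point the pool at a rule whose failure domain is "osd" instead of "host". A rough sketch, using your pool name; "replicated_osd" is just an example rule name:

ceph osd crush rule create-replicated replicated_osd default osd
ceph osd pool set lgcmUnsafe crush_rule replicated_osd

Once the other hosts are in, you can switch back with 'ceph osd pool set lgcmUnsafe crush_rule replicated_rule'.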

Having one yaml file doesn't seem like overkill to me compared to managing 12 OSDs manually. ;-) As I already wrote, you might not even need a yaml file if the OSD layout stays like this (all OSDs are standalone, without separate RocksDB devices); you could simply leave the "all-available-devices" service in "managed" state. That would deploy new OSDs automatically in case you replace (or wipe) one. But you'll find the most suitable way for yourself, it was just a suggestion.
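
For reference, the "managed" variant I mean boils down to (a sketch, not something you must run):

ceph orch apply osd --all-available-devices

and if you later want cephadm to stop picking up new disks automatically, you can flip the service to unmanaged:

ceph orch apply osd --all-available-devices --unmanaged=true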


Zitat von Renato Callado Borges <renato.callado@xxxxxxxxxxxx>:

Hi Eugen!


How are you?

Thank you for your help!

# ceph osd tree

ID  CLASS  WEIGHT     TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         174.62640  root default
-3         174.62640      host darkside2
 0    hdd   14.55220          osd.0           up   1.00000  1.00000
 1    hdd   14.55220          osd.1           up   1.00000  1.00000
 2    hdd   14.55220          osd.2           up   1.00000  1.00000
 3    hdd   14.55220          osd.3           up   1.00000  1.00000
 4    hdd   14.55220          osd.4           up   1.00000  1.00000
 5    hdd   14.55220          osd.5           up   1.00000  1.00000
 6    hdd   14.55220          osd.6           up   1.00000  1.00000
 7    hdd   14.55220          osd.7           up   1.00000  1.00000
 8    hdd   14.55220          osd.8           up   1.00000  1.00000
 9    hdd   14.55220          osd.9           up   1.00000  1.00000
10    hdd   14.55220          osd.10          up   1.00000  1.00000
11    hdd   14.55220          osd.11          up   1.00000  1.00000

# ceph osd crush rule dump replicated_rule
{
    "rule_id": 0,
    "rule_name": "replicated_rule",
    "ruleset": 0,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

I read about --all-available-devices, but I figured there was a slight chance it would add the system disks (two RAID1 HDDs). So I went the route of manually adding the 'storage' HDDs. Regarding the yaml, it seemed like overkill.

But perhaps you are mentioning this because --all-available-devices does some legwork invisibly? Would it be more sensible for me to backtrack everything and then run this automated command instead?


Cordially,
Renata.

On 18/10/2022 12:40, Eugen Block wrote:
Hi,

the command doesn't return because your PGs are inactive. It looks like you're trying to use the default replicated_rule, but it can't find a suitable placement. What does your 'ceph osd tree' look like? Please also paste your ruleset ('ceph osd crush rule dump replicated_rule'). Regarding OSD management, you could have simply let cephadm choose all available disks for you [1]:

ceph orch device ls
ceph orch apply osd --all-available-devices

Or create a service spec yaml file [2] and run 'ceph orch apply -i osd-specs.yaml' once to deploy all OSDs on all target nodes from that file.

[1] https://docs.ceph.com/en/latest/cephadm/services/osd/#deploy-osds
[2] https://docs.ceph.com/en/latest/cephadm/services/osd/#examples
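
A minimal spec for your case could look roughly like this (service_id and host_pattern are just placeholders derived from your host name, and 'all: true' grabs every available data device):

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: 'darkside*'
data_devices:
  all: true

Saved as osd-specs.yaml, 'ceph orch apply -i osd-specs.yaml' would then create standalone OSDs on every matching available device on all matching hosts.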

Zitat von Renato Callado Borges <renato.callado@xxxxxxxxxxxx>:

Dear all,


I am deploying a Ceph system for the first time.

I have 3 servers, and I intend to run 1 manager, 1 mon and 12 OSDs on each.

Since they are already in production use, I selected a single machine to begin the deployment, but I got stuck when creating RBD pools.

The host OS is CentOS 7, and cephadm allowed me to install Octopus.

These are the commands I have issued so far:

./cephadm add-repo --release octopus
./cephadm install ceph-common
cephadm bootstrap --mon-ip "X.X.X.X" # edited for privacy, real IP used.
ceph orch daemon add osd darkside2:/dev/sdb

This last add command was repeated 12 times, once for each block device to be added to Ceph storage.
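
In shell terms the repetition was roughly the following loop (the device names after /dev/sdb are an assumption, just to illustrate the 12 invocations):

for dev in /dev/sd{b..m}; do   # 12 data disks, sdb through sdm assumed
    ceph orch daemon add osd darkside2:$dev
done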

ceph osd pool create lgcmUnsafe 128 128

Up to this point everything seemed fine: no error messages in journalctl or in /var/log/ceph/cephadm.log. I ran ceph status after each command and the output looked consistent.

This command, though, gets stuck forever, with no error or warning message anywhere:

rbd pool init lgcmUnsafe

I canceled the command with Ctrl+C and ran ceph status. This is the output:

  cluster:
    id:     1902a026-496d-11ed-b43e-08c0eb320ec2
    health: HEALTH_WARN
            Reduced data availability: 128 pgs inactive
            Degraded data redundancy: 128 pgs undersized

  services:
    mon: 1 daemons, quorum darkside2 (age 19h)
    mgr: darkside2.umccvh(active, since 19h)
    osd: 12 osds: 12 up (since 19h), 12 in (since 4d); 1 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 13 objects, 0 B
    usage:   12 GiB used, 175 TiB / 175 TiB avail
    pgs:     99.225% pgs not active
             26/39 objects misplaced (66.667%)
             128 undersized+peered
             1   active+clean+remapped
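
For completeness, a few commands that could help narrow down why the PGs stay inactive (just a sketch of what I can check, using the pool name from above):

ceph health detail
ceph pg dump_stuck inactive
ceph osd pool get lgcmUnsafe size
ceph osd pool get lgcmUnsafe crush_rule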



Could someone more knowledgeable help me debug this, please? Thanks in advance!


Cordially,
Renata.



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



