Re: Orchestrator cephadm not setting CRUSH weight on OSD

Hi,

did you find an explanation for this?

I saw something similar on a customer's cluster. They reprovisioned OSDs on one host (I don't know whether any OSD IDs were reused) with smaller disk sizes (the size was changed through the RAID controller to match the other hosts in that cluster), and the new OSDs got their old CRUSH weights, reflecting the old disk sizes. In Luminous I remember that changed reweights (not sure about CRUSH weights) were stored somewhere in /var/run/ceph/, but that doesn't seem to be the case anymore, and it would also only be relevant until a reboot. I'd also be interested where this information is stored in newer releases and why it's stored in the first place.
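
For reference, the currently effective values can at least be inspected directly from the cluster; these are plain commands available in recent releases, nothing host-specific:

ceph osd df tree        # per-OSD view showing CRUSH weight and reweight
ceph osd crush dump     # full CRUSH map including item weights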

Regards,
Eugen


Quoting Robert Sander <r.sander@xxxxxxxxxxxxxxxxxxx>:

Hi,

I stumbled across an issue where an OSD that gets redeployed ends up with a CRUSH weight of 0 after cephadm finishes.

I have created a service definition for the orchestrator to automatically deploy OSDs on SSDs:

service_type: osd
service_id: SSD_OSDs
placement:
  label: 'osd'
data_devices:
  rotational: 0
  size: '100G'
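
(For reference, such a spec is applied with something like the following; the file name here is only an example:)

ceph orch apply osd -i SSD_OSDs.yaml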

These are my steps to reproduce this in a small test cluster running 15.2.4:

root@ceph01:~# ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
 -1         1.63994  root default
-18         0.81995      rack rack10
 -3         0.40996          host ceph01
  8    hdd  0.10699              osd.8        up   1.00000  1.00000
  9    hdd  0.10699              osd.9        up   1.00000  1.00000
  0    ssd  0.09799              osd.0        up   1.00000  1.00000
  1    ssd  0.09798              osd.1        up   1.00000  1.00000
 -5         0.40999          host ceph02
 10    hdd  0.10699              osd.10       up   1.00000  1.00000
 11    hdd  0.10699              osd.11       up   1.00000  1.00000
  2    ssd  0.09799              osd.2        up   1.00000  1.00000
  3    ssd  0.09799              osd.3        up   1.00000  1.00000
-17         0.81999      rack rack11
 -7         0.40999          host ceph03
 12    hdd  0.10699              osd.12       up   1.00000  1.00000
 13    hdd  0.10699              osd.13       up   1.00000  1.00000
  4    ssd  0.09799              osd.4        up   1.00000  1.00000
  5    ssd  0.09799              osd.5        up   1.00000  1.00000
 -9         0.40999          host ceph04
 14    hdd  0.10699              osd.14       up   1.00000  1.00000
 15    hdd  0.10699              osd.15       up   1.00000  1.00000
  6    ssd  0.09799              osd.6        up   1.00000  1.00000
  7    ssd  0.09799              osd.7        up   1.00000  1.00000
root@ceph01:~# ceph osd out 1
marked out osd.1.
root@ceph01:~# ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
 -1         1.63994  root default
-18         0.81995      rack rack10
 -3         0.40996          host ceph01
  8    hdd  0.10699              osd.8        up   1.00000  1.00000
  9    hdd  0.10699              osd.9        up   1.00000  1.00000
  0    ssd  0.09799              osd.0        up   1.00000  1.00000
  1    ssd  0.09798              osd.1        up         0  1.00000
 -5         0.40999          host ceph02
 10    hdd  0.10699              osd.10       up   1.00000  1.00000
 11    hdd  0.10699              osd.11       up   1.00000  1.00000
  2    ssd  0.09799              osd.2        up   1.00000  1.00000
  3    ssd  0.09799              osd.3        up   1.00000  1.00000
-17         0.81999      rack rack11
 -7         0.40999          host ceph03
 12    hdd  0.10699              osd.12       up   1.00000  1.00000
 13    hdd  0.10699              osd.13       up   1.00000  1.00000
  4    ssd  0.09799              osd.4        up   1.00000  1.00000
  5    ssd  0.09799              osd.5        up   1.00000  1.00000
 -9         0.40999          host ceph04
 14    hdd  0.10699              osd.14       up   1.00000  1.00000
 15    hdd  0.10699              osd.15       up   1.00000  1.00000
  6    ssd  0.09799              osd.6        up   1.00000  1.00000
  7    ssd  0.09799              osd.7        up   1.00000  1.00000
root@ceph01:~# ceph orch osd rm 1
Scheduled OSD(s) for removal

2020-09-10T16:29:58.176991+0200 mgr.ceph02.ouelws [INF] Removing daemon osd.1 from ceph01
2020-09-10T16:30:00.148659+0200 mgr.ceph02.ouelws [INF] Successfully removed OSD <1> on ceph01

root@ceph01:~# ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
 -1         1.54196  root default
-18         0.72197      rack rack10
 -3         0.31198          host ceph01
  8    hdd  0.10699              osd.8        up   1.00000  1.00000
  9    hdd  0.10699              osd.9        up   1.00000  1.00000
  0    ssd  0.09799              osd.0        up   1.00000  1.00000
 -5         0.40999          host ceph02
 10    hdd  0.10699              osd.10       up   1.00000  1.00000
 11    hdd  0.10699              osd.11       up   1.00000  1.00000
  2    ssd  0.09799              osd.2        up   1.00000  1.00000
  3    ssd  0.09799              osd.3        up   1.00000  1.00000
-17         0.81999      rack rack11
 -7         0.40999          host ceph03
 12    hdd  0.10699              osd.12       up   1.00000  1.00000
 13    hdd  0.10699              osd.13       up   1.00000  1.00000
  4    ssd  0.09799              osd.4        up   1.00000  1.00000
  5    ssd  0.09799              osd.5        up   1.00000  1.00000
 -9         0.40999          host ceph04
 14    hdd  0.10699              osd.14       up   1.00000  1.00000
 15    hdd  0.10699              osd.15       up   1.00000  1.00000
  6    ssd  0.09799              osd.6        up   1.00000  1.00000
  7    ssd  0.09799              osd.7        up   1.00000  1.00000
root@ceph01:~# ceph orch device zap ceph01 /dev/sdc --force
INFO:cephadm:/usr/bin/docker:stderr --> Zapping: /dev/sdc
INFO:cephadm:/usr/bin/docker:stderr --> Zapping lvm member /dev/sdc. lv_path is /dev/ceph-0d19a151-30b6-459e-936a-488f143e11f6/osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb
INFO:cephadm:/usr/bin/docker:stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-0d19a151-30b6-459e-936a-488f143e11f6/osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb bs=1M count=10 conv=fsync
INFO:cephadm:/usr/bin/docker:stderr  stderr: 10+0 records in
INFO:cephadm:/usr/bin/docker:stderr 10+0 records out
INFO:cephadm:/usr/bin/docker:stderr stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0583658 s, 180 MB/s
INFO:cephadm:/usr/bin/docker:stderr --> Only 1 LV left in VG, will proceed to destroy volume group ceph-0d19a151-30b6-459e-936a-488f143e11f6
INFO:cephadm:/usr/bin/docker:stderr Running command: /usr/sbin/vgremove -v -f ceph-0d19a151-30b6-459e-936a-488f143e11f6
INFO:cephadm:/usr/bin/docker:stderr stderr: Removing ceph--0d19a151--30b6--459e--936a--488f143e11f6-osd--block--d5062900--abe7--413a--9d9a--d1cdda2948eb (253:3)
INFO:cephadm:/usr/bin/docker:stderr stderr: Archiving volume group "ceph-0d19a151-30b6-459e-936a-488f143e11f6" metadata (seqno 5).
INFO:cephadm:/usr/bin/docker:stderr Releasing logical volume "osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb"
INFO:cephadm:/usr/bin/docker:stderr stderr: Creating volume group backup "/etc/lvm/backup/ceph-0d19a151-30b6-459e-936a-488f143e11f6" (seqno 6).
INFO:cephadm:/usr/bin/docker:stderr stdout: Logical volume "osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb" successfully removed
INFO:cephadm:/usr/bin/docker:stderr stderr: Removing physical volume "/dev/sdc" from volume group "ceph-0d19a151-30b6-459e-936a-488f143e11f6"
INFO:cephadm:/usr/bin/docker:stderr stdout: Volume group "ceph-0d19a151-30b6-459e-936a-488f143e11f6" successfully removed
INFO:cephadm:/usr/bin/docker:stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/sdc bs=1M count=10 conv=fsync
INFO:cephadm:/usr/bin/docker:stderr  stderr: 10+0 records in
INFO:cephadm:/usr/bin/docker:stderr 10+0 records out
INFO:cephadm:/usr/bin/docker:stderr stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.016043 s, 654 MB/s
INFO:cephadm:/usr/bin/docker:stderr --> Zapping successful for: <Raw Device: /dev/sdc>

2020-09-10T16:31:15.951617+0200 mgr.ceph02.ouelws [INF] Zap device ceph01:/dev/sdc
2020-09-10T16:31:24.738974+0200 mgr.ceph02.ouelws [INF] Found osd claims for drivegroup SSD_OSDs -> {}
2020-09-10T16:31:24.740489+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host ceph01...
2020-09-10T16:31:31.549897+0200 mgr.ceph02.ouelws [INF] Deploying daemon osd.1 on ceph01
2020-09-10T16:31:33.057061+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host ceph02...
2020-09-10T16:31:33.057373+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host ceph03...
2020-09-10T16:31:33.057519+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host ceph04...
2020-09-10T16:31:37.569914+0200 mon.ceph01 [INF] osd.1 [v2:10.24.4.128:6810/4173467371,v1:10.24.4.128:6811/4173467371] boot
2020-09-10T16:31:46.531544+0200 mon.ceph01 [INF] Cluster is now healthy

root@ceph01:~# ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
 -1         1.54196  root default
-18         0.72197      rack rack10
 -3         0.31198          host ceph01
  8    hdd  0.10699              osd.8        up   1.00000  1.00000
  9    hdd  0.10699              osd.9        up   1.00000  1.00000
  0    ssd  0.09799              osd.0        up   1.00000  1.00000
  1    ssd        0              osd.1        up   1.00000  1.00000
 -5         0.40999          host ceph02
 10    hdd  0.10699              osd.10       up   1.00000  1.00000
 11    hdd  0.10699              osd.11       up   1.00000  1.00000
  2    ssd  0.09799              osd.2        up   1.00000  1.00000
  3    ssd  0.09799              osd.3        up   1.00000  1.00000
-17         0.81999      rack rack11
 -7         0.40999          host ceph03
 12    hdd  0.10699              osd.12       up   1.00000  1.00000
 13    hdd  0.10699              osd.13       up   1.00000  1.00000
  4    ssd  0.09799              osd.4        up   1.00000  1.00000
  5    ssd  0.09799              osd.5        up   1.00000  1.00000
 -9         0.40999          host ceph04
 14    hdd  0.10699              osd.14       up   1.00000  1.00000
 15    hdd  0.10699              osd.15       up   1.00000  1.00000
  6    ssd  0.09799              osd.6        up   1.00000  1.00000
  7    ssd  0.09799              osd.7        up   1.00000  1.00000

Why does osd.1 have a weight of 0 now?

When the OSDs were initially deployed with the first ceph orch apply command, their weights were set correctly according to their size. Why is there a difference between that process and an OSD that is (re-)deployed later on?
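
The only workaround I see right now is to put the CRUSH weight back by hand, using the value from the tree above, e.g.:

ceph osd crush reweight osd.1 0.09798

But that should of course not be necessary for every redeployed OSD.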

Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Mandatory disclosures per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing director: Peer Heinlein -- Registered office: Berlin


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



