Re: All replicas of pg 5.b got placed on the same host - how to correct?

Thanks a lot - problem fixed.


On 10.10.2017 16:58, Peter Linder wrote:
I think your failure domain within your rules is wrong.

step choose firstn 0 type osd

Should be:

step choose firstn 0 type host
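
In full, the corrected hdd rule could look like the sketch below; chooseleaf descends from each chosen host to a single OSD in one step, so replicas land on distinct hosts (the ssd rule needs the same change):

rule hdd {
     id 1
     type replicated
     min_size 1
     max_size 10
     step take default class hdd
     step chooseleaf firstn 0 type host
     step emit
}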


On 10/10/2017 5:05 PM, Konrad Riedel wrote:
Hello Ceph-users,

after switching to Luminous I was excited about the great crush-device-class feature. We now have 5 servers, each with one 2 TB NVMe-based OSD; 3 of them additionally have 4 HDDs per server. (We have only three 400 GB NVMe disks for block.wal and block.db and therefore can't distribute the HDDs evenly across all servers.)

Output from "ceph pg dump" shows that some PGs end up on HDD OSDs on
the same
Host:

ceph pg map 5.b
osdmap e12912 pg 5.b (5.b) -> up [9,7,8] acting [9,7,8]

(on rebooting this host I had 4 stale PGs)

I've written a small Perl script that appends the hostname after each OSD number, and found many PGs where Ceph placed 2 replicas on the same host (a shell sketch of the same idea is shown after the list):

5.1e7: 8 - daniel 9 - daniel 11 - udo
5.1eb: 10 - udo 7 - daniel 9 - daniel
5.1ec: 10 - udo 11 - udo 7 - daniel
5.1ed: 13 - felix 16 - felix 5 - udo
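
A minimal sketch of that lookup, assuming the standard ceph CLI and that "ceph osd find" reports the host in its crush location; 9, 7 and 8 are the acting set of pg 5.b from above:

# print the host of each OSD in a PG's acting set
for osd in 9 7 8; do
    printf 'osd.%s -> ' "$osd"
    ceph osd find "$osd" | grep '"host"'
done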


Is there any way I can correct this?


Please see the crushmap below (a sketch of the edit-and-reinject workflow follows it). Thanks for any help!

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 device1
device 2 osd.2 class ssd
device 3 device3
device 4 device4
device 5 osd.5 class hdd
device 6 device6
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 device15
device 16 osd.16 class hdd
device 17 device17
device 18 device18
device 19 device19
device 20 device20
device 21 device21
device 22 device22
device 23 device23
device 24 osd.24 class hdd
device 25 device25
device 26 osd.26 class hdd
device 27 osd.27 class hdd
device 28 osd.28 class hdd
device 29 osd.29 class hdd
device 30 osd.30 class ssd
device 31 osd.31 class ssd
device 32 osd.32 class ssd
device 33 osd.33 class ssd

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host daniel {
     id -4        # do not change unnecessarily
     id -2 class hdd        # do not change unnecessarily
     id -9 class ssd        # do not change unnecessarily
     # weight 3.459
     alg straw2
     hash 0    # rjenkins1
     item osd.31 weight 1.819
     item osd.7 weight 0.547
     item osd.8 weight 0.547
     item osd.9 weight 0.547
}
host felix {
     id -5        # do not change unnecessarily
     id -3 class hdd        # do not change unnecessarily
     id -10 class ssd        # do not change unnecessarily
     # weight 3.653
     alg straw2
     hash 0    # rjenkins1
     item osd.33 weight 1.819
     item osd.13 weight 0.547
     item osd.14 weight 0.467
     item osd.16 weight 0.547
     item osd.0 weight 0.274
}
host udo {
     id -6        # do not change unnecessarily
     id -7 class hdd        # do not change unnecessarily
     id -11 class ssd        # do not change unnecessarily
     # weight 4.006
     alg straw2
     hash 0    # rjenkins1
     item osd.32 weight 1.819
     item osd.5 weight 0.547
     item osd.10 weight 0.547
     item osd.11 weight 0.547
     item osd.12 weight 0.547
}
host moritz {
     id -13        # do not change unnecessarily
     id -14 class hdd        # do not change unnecessarily
     id -15 class ssd        # do not change unnecessarily
     # weight 1.819
     alg straw2
     hash 0    # rjenkins1
     item osd.30 weight 1.819
}
host bruno {
     id -16        # do not change unnecessarily
     id -17 class hdd        # do not change unnecessarily
     id -18 class ssd        # do not change unnecessarily
     # weight 3.183
     alg straw2
     hash 0    # rjenkins1
     item osd.24 weight 0.273
     item osd.26 weight 0.273
     item osd.27 weight 0.273
     item osd.28 weight 0.273
     item osd.29 weight 0.273
     item osd.2 weight 1.819
}
root default {
     id -1        # do not change unnecessarily
     id -8 class hdd        # do not change unnecessarily
     id -12 class ssd        # do not change unnecessarily
     # weight 16.121
     alg straw2
     hash 0    # rjenkins1
     item daniel weight 3.459
     item felix weight 3.653
     item udo weight 4.006
     item moritz weight 1.819
     item bruno weight 3.183
}

# rules
rule ssd {
     id 0
     type replicated
     min_size 1
     max_size 10
     step take default class ssd
     step choose firstn 0 type osd
     step emit
}
rule hdd {
     id 1
     type replicated
     min_size 1
     max_size 10
     step take default class hdd
     step choose firstn 0 type osd
     step emit
}

# end crush map
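
For reference, the usual way to change a rule like this is to export, decompile, edit, recompile and re-inject the crush map; a sketch with the standard tools (file names are arbitrary, and re-injecting a changed map will trigger data movement):

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit the rules in crushmap.txt, e.g. set the failure domain to "host"
crushtool -c crushmap.txt -o crushmap.new
# sanity check: show a few mappings for rule 1 (hdd) with 3 replicas
crushtool -i crushmap.new --test --rule 1 --num-rep 3 --show-mappings | head
ceph osd setcrushmap -i crushmap.new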


--

Kind regards

Konrad Riedel

--

Berufsförderungswerk Dresden gemeinnützige GmbH
SG1
IT/Infrastruktur
Hellerhofstraße 35
D-01129 Dresden
Phone (03 51) 85 48 - 115
Fax   (03 51) 85 48 - 507
E-Mail   it@xxxxxxxxxxxxxx

--------------------------------------------------------------------
Chair of the Administrative Board: Dr. Ina Ueberschär
Managing Director: Henry Köhler
Registered office: Dresden
Commercial register: Amtsgericht Dresden HRB 2380
Certified to ISO 9001:2015 and AZAV
--------------------------------------------------------------------

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



