Re: Health_Warn recovery stuck / crushmap problem?

Hi Jonas,

In your current CRUSH map, your root ssd contains two host buckets, but those buckets contain no OSDs, and this is what is causing the problem.

It looks like you forgot to set the parameter osd_crush_update_on_start = false before applying your special CRUSH map. Hence, when you restarted the OSDs, they went back to the default behaviour of attaching themselves to the host they run on.
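
You can confirm this at any time with:

ceph osd tree

It prints the hierarchy the cluster is actually using; in your case all eight OSDs will show up under pvestorage1/pvestorage2 in root default, with the -ssd/-platter hosts empty, which matches the decompiled map you pasted below.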

To get back to a healthy state for now, set the parameter above in the ceph.conf on your OSD nodes, restart your OSDs, and then re-apply your customized CRUSH map.
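
Something like this (a sketch, not verbatim: adjust the restart command to your init system, and substitute the file that actually holds your custom map for my-crushmap.txt):

# in ceph.conf on each OSD node, [osd] section:
[osd]
osd crush update on start = false

# restart the OSDs (systemd example; use your init scripts otherwise):
systemctl restart ceph-osd.target

# recompile and re-inject your customized CRUSH map:
crushtool -c my-crushmap.txt -o my-crushmap.bin
ceph osd setcrushmap -i my-crushmap.bin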

As an alternative, you can also use the CRUSH location hook to automate the placement of your OSDs (http://docs.ceph.com/docs/master/rados/operations/crush-map/#custom-location-hooks).
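
For reference, a minimal sketch of such a hook, using the OSD-to-host mapping from your intended map (per the docs page above, Ceph calls the script with --cluster/--id/--type and uses the single line it prints on stdout as the OSD's CRUSH location; the option spelling and script path below are examples, double-check them for your release):

# in ceph.conf:
[osd]
osd crush location hook = /usr/local/bin/ceph-crush-location

# /usr/local/bin/ceph-crush-location (mark it executable):
#!/bin/sh
# Ceph invokes this as: <script> --cluster <name> --id <osd-id> --type osd
while [ "$#" -gt 0 ]; do
        case "$1" in
                --id) ID="$2"; shift ;;
        esac
        shift
done
case "$ID" in
        0|1) echo "root=ssd host=pvestorage1-ssd" ;;
        2|3) echo "root=ssd host=pvestorage2-ssd" ;;
        4|5) echo "root=platter host=pvestorage1-platter" ;;
        6|7) echo "root=platter host=pvestorage2-platter" ;;
esac

With a hook in place you no longer need osd_crush_update_on_start = false, since each OSD re-registers itself at the location the script returns every time it starts.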

Regards
JC

On 24 Jan 2017, at 07:42, Jonas Stunkat <jonas.stunkat@xxxxxxxxxxx> wrote:

All OSDs and monitors are up from what I can see.
I read through the troubleshooting for PGs in the Ceph documentation and came to the conclusion that nothing there would help me, so I didn't try anything - except restarting / rebooting OSDs and monitors.

How do I recover from this? It looks to me like the data itself should be safe for now, but why is it not recovering?
I guess the problem may be the crushmap.

Here are some outputs:

# ceph health detail

HEALTH_WARN 475 pgs degraded; 640 pgs stale; 475 pgs stuck degraded; 640 pgs stuck stale; 640 pgs stuck unclean; 475 pgs stuck undersized; 475 pgs undersized; recovery 104812/279550 objects degraded (37.493%); recovery 69926/279550 objects misplaced (25.014%)
pg 3.ec is stuck unclean for 3326815.935321, current state stale+active+remapped, last acting [7,6]
pg 3.ed is stuck unclean for 3288818.682456, current state stale+active+remapped, last acting [6,7]
pg 3.ee is stuck unclean for 409973.052061, current state stale+active+undersized+degraded, last acting [7]
pg 3.ef is stuck unclean for 3357894.554762, current state stale+active+undersized+degraded, last acting [7]
pg 3.e8 is stuck unclean for 384815.518837, current state stale+active+undersized+degraded, last acting [6]
pg 3.e9 is stuck unclean for 3274554.591000, current state stale+active+remapped, last acting [6,7]
......

################################################################################

This is the crushmap I created, intended to use, and thought I had been using for the past 2 months:
- pvestorage1-ssd and pvestorage1-platter are the same host; it seems like this is not possible, but I never noticed
- likewise with pvestorage2

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host pvestorage1-ssd {
        id -2   # do not change unnecessarily
        # weight 1.740
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.870
        item osd.1 weight 0.870
}
host pvestorage2-ssd {
        id -3   # do not change unnecessarily
        # weight 1.740
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 0.870
        item osd.3 weight 0.870
}
host pvestorage1-platter {
        id -4           # do not change unnecessarily
        # weight 4
        alg straw
        hash 0  # rjenkins1
        item osd.4 weight 2.000
        item osd.5 weight 2.000
}
host pvestorage2-platter {
        id -5           # do not change unnecessarily
        # weight 4
        alg straw
        hash 0  # rjenkins1
        item osd.6 weight 2.000
        item osd.7 weight 2.000
}

root ssd {
        id -1   # do not change unnecessarily
        # weight 3.480
        alg straw
        hash 0  # rjenkins1
        item pvestorage1-ssd weight 1.740
        item pvestorage2-ssd weight 1.740
}

root platter {
        id -6           # do not change unnecessarily
        # weight 8
        alg straw
        hash 0  # rjenkins1
        item pvestorage1-platter weight 4.000
        item pvestorage2-platter weight 4.000
}

# rules
rule ssd {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type host
        step emit
}

rule platter {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take platter
        step chooseleaf firstn 0 type host
        step emit
}
# end crush map
################################################################################

This is what Ceph made of that crushmap and the one that is actually in use right now; I never looked -_- :

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host pvestorage1-ssd {
        id -2   # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host pvestorage2-ssd {
        id -3   # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
root ssd {
        id -1   # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item pvestorage1-ssd weight 0.000
        item pvestorage2-ssd weight 0.000
}
host pvestorage1-platter {
        id -4   # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host pvestorage2-platter {
        id -5   # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
root platter {
        id -6   # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item pvestorage1-platter weight 0.000
        item pvestorage2-platter weight 0.000
}
host pvestorage1 {
        id -7   # do not change unnecessarily
        # weight 5.740
        alg straw
        hash 0  # rjenkins1
        item osd.5 weight 2.000
        item osd.4 weight 2.000
        item osd.1 weight 0.870
        item osd.0 weight 0.870
}
host pvestorage2 {
        id -9   # do not change unnecessarily
        # weight 5.740
        alg straw
        hash 0  # rjenkins1
        item osd.3 weight 0.870
        item osd.2 weight 0.870
        item osd.6 weight 2.000
        item osd.7 weight 2.000
}
root default {
        id -8   # do not change unnecessarily
        # weight 11.480
        alg straw
        hash 0  # rjenkins1
        item pvestorage1 weight 5.740
        item pvestorage2 weight 5.740
}

# rules
rule ssd {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type host
        step emit
}
rule platter {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take platter
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
################################################################################

How do I recover from this?

Best Regards
Jonas

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
