Re: osd not removed from crush map after ceph osd crush remove

Dimitar,

Is it fixed?

- Is your pool size 2?
- You could consider running ceph pg repair {pgid} or ceph osd lost 4 (the latter is a somewhat dangerous command).
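
Before running either command, it may help to list exactly which PGs are involved. The following is a minimal sketch, assuming the `ceph health detail` line format quoted further down in this thread. `stale_pgs_on_osd` is a hypothetical helper name, not a ceph tool, and it only filters text on stdin without running any ceph command.

```shell
#!/bin/sh
# Print the IDs of stale PGs whose last acting set is exactly [<osd-id>].
# Reads `ceph health detail` text on stdin; runs no ceph commands itself.
# (stale_pgs_on_osd is a hypothetical helper name used for illustration.)
stale_pgs_on_osd() {
    osd_id="$1"
    awk -v want="[$osd_id]" '
        $1 == "pg" && /is stuck stale/ && $NF == want { print $2 }
    '
}

# Sample lines copied from the health output quoted in this thread.
sample='pg 7.2a is stuck stale for 8887559.656879, current state stale+active+clean, last acting [4]
pg 2.3a is stuck stale for 8887559.657406, current state stale+active+clean, last acting [4]
HEALTH_WARN 155 pgs stale; 155 pgs stuck stale'

printf '%s\n' "$sample" | stale_pgs_on_osd 4
```

On a live cluster the input would presumably come from `ceph health detail | stale_pgs_on_osd 4`. Each resulting ID can then be checked with `ceph pg map` before deciding on anything destructive.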

****************************************************************
Karan Singh 
Systems Specialist , Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/
****************************************************************

> On 22 Feb 2016, at 10:10, Dimitar Boichev <Dimitar.Boichev@xxxxxxxxxxxxx> wrote:
> 
> Anyone ?
>  
> Regards.
>  
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Dimitar Boichev
> Sent: Thursday, February 18, 2016 5:06 PM
> To: ceph-users@xxxxxxxxxxxxxx
> Subject:  osd not removed from crush map after ceph osd crush remove
>  
> Hello,
> I am running a tiny cluster of 2 nodes.
> ceph -v
> ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
>  
> One osd died and I added a new osd (not replacing the old one).
> After that I wanted to remove the failed osd completely from the cluster.
> Here is what I did:
> ceph osd reweight osd.4 0.0
> ceph osd crush reweight osd.4 0.0
> ceph osd out osd.4
> ceph osd crush remove osd.4
> ceph auth del osd.4
> ceph osd rm osd.4
>  
>  
> But after the rebalancing I ended up with 155 PGs in the stale+active+clean state.
>  
> @storage1:/tmp# ceph -s
>     cluster 7a9120b9-df42-4308-b7b1-e1f3d0f1e7b3
>      health HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32 sec; nodeep-scrub flag(s) set
>      monmap e1: 1 mons at {storage1=192.168.10.3:6789/0}, election epoch 1, quorum 0 storage1
>      osdmap e1064: 6 osds: 6 up, 6 in
>             flags nodeep-scrub
>       pgmap v26760322: 712 pgs, 8 pools, 532 GB data, 155 kobjects
>             1209 GB used, 14210 GB / 15419 GB avail
>                  155 stale+active+clean
>                  557 active+clean
>   client io 91925 B/s wr, 5 op/s
>  
> I know about the single-monitor problem. I just want to get the cluster back to a healthy state first; then I will add the third storage node and go up to 3 monitors.
>  
> The problem is as follows:
> @storage1:/tmp# ceph pg map 2.3a
> osdmap e1064 pg 2.3a (2.3a) -> up [6] acting [6]
> @storage1:/tmp# ceph pg 2.3a query
> Error ENOENT: i don't have pgid 2.3a
>  
>  
> @storage1:/tmp# ceph health detail
> HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32 sec; 1 osds have slow requests; nodeep-scrub flag(s) set
> pg 7.2a is stuck stale for 8887559.656879, current state stale+active+clean, last acting [4]
> pg 5.28 is stuck stale for 8887559.656886, current state stale+active+clean, last acting [4]
> pg 7.2b is stuck stale for 8887559.656889, current state stale+active+clean, last acting [4]
> pg 7.2c is stuck stale for 8887559.656892, current state stale+active+clean, last acting [4]
> pg 0.2b is stuck stale for 8887559.656893, current state stale+active+clean, last acting [4]
> pg 6.2c is stuck stale for 8887559.656894, current state stale+active+clean, last acting [4]
> pg 6.2f is stuck stale for 8887559.656893, current state stale+active+clean, last acting [4]
> pg 2.2b is stuck stale for 8887559.656896, current state stale+active+clean, last acting [4]
> pg 2.25 is stuck stale for 8887559.656896, current state stale+active+clean, last acting [4]
> pg 6.20 is stuck stale for 8887559.656898, current state stale+active+clean, last acting [4]
> pg 5.21 is stuck stale for 8887559.656898, current state stale+active+clean, last acting [4]
> pg 0.24 is stuck stale for 8887559.656904, current state stale+active+clean, last acting [4]
> pg 2.21 is stuck stale for 8887559.656904, current state stale+active+clean, last acting [4]
> pg 5.27 is stuck stale for 8887559.656906, current state stale+active+clean, last acting [4]
> pg 2.23 is stuck stale for 8887559.656908, current state stale+active+clean, last acting [4]
> pg 6.26 is stuck stale for 8887559.656909, current state stale+active+clean, last acting [4]
> pg 7.27 is stuck stale for 8887559.656913, current state stale+active+clean, last acting [4]
> pg 7.18 is stuck stale for 8887559.656914, current state stale+active+clean, last acting [4]
> pg 0.1e is stuck stale for 8887559.656914, current state stale+active+clean, last acting [4]
> pg 6.18 is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
> pg 2.1f is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
> pg 7.1b is stuck stale for 8887559.656922, current state stale+active+clean, last acting [4]
> pg 0.1b is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
> pg 6.1d is stuck stale for 8887559.656925, current state stale+active+clean, last acting [4]
> pg 2.18 is stuck stale for 8887559.656920, current state stale+active+clean, last acting [4]
> pg 7.1d is stuck stale for 8887559.656926, current state stale+active+clean, last acting [4]
> pg 5.1c is stuck stale for 8887559.656921, current state stale+active+clean, last acting [4]
> pg 5.1d is stuck stale for 8887559.656920, current state stale+active+clean, last acting [4]
> pg 6.11 is stuck stale for 8887559.656922, current state stale+active+clean, last acting [4]
> pg 5.13 is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
> pg 0.16 is stuck stale for 8887559.656924, current state stale+active+clean, last acting [4]
> pg 6.10 is stuck stale for 8887559.656928, current state stale+active+clean, last acting [4]
> pg 2.17 is stuck stale for 8887559.656927, current state stale+active+clean, last acting [4]
> pg 7.12 is stuck stale for 8887559.656932, current state stale+active+clean, last acting [4]
> pg 0.12 is stuck stale for 8887559.656929, current state stale+active+clean, last acting [4]
> pg 6.14 is stuck stale for 8887559.656935, current state stale+active+clean, last acting [4]
> pg 0.11 is stuck stale for 8887559.656932, current state stale+active+clean, last acting [4]
> pg 7.16 is stuck stale for 8887559.656936, current state stale+active+clean, last acting [4]
> pg 0.10 is stuck stale for 8887559.656936, current state stale+active+clean, last acting [4]
> pg 2.d is stuck stale for 8887559.656933, current state stale+active+clean, last acting [4]
> pg 6.9 is stuck stale for 8887559.656939, current state stale+active+clean, last acting [4]
> pg 7.9 is stuck stale for 8887559.656939, current state stale+active+clean, last acting [4]
> pg 0.d is stuck stale for 8887559.656940, current state stale+active+clean, last acting [4]
> pg 7.a is stuck stale for 8887559.656944, current state stale+active+clean, last acting [4]
> pg 0.c is stuck stale for 8887559.656941, current state stale+active+clean, last acting [4]
> pg 2.e is stuck stale for 8887559.656947, current state stale+active+clean, last acting [4]
> pg 6.a is stuck stale for 8887559.656953, current state stale+active+clean, last acting [4]
> pg 0.b is stuck stale for 8887559.656949, current state stale+active+clean, last acting [4]
> pg 2.9 is stuck stale for 8887559.656954, current state stale+active+clean, last acting [4]
> pg 5.f is stuck stale for 8887559.656953, current state stale+active+clean, last acting [4]
> pg 7.d is stuck stale for 8887559.656958, current state stale+active+clean, last acting [4]
> pg 6.f is stuck stale for 8887559.656957, current state stale+active+clean, last acting [4]
> pg 3.4 is stuck stale for 8887559.656957, current state stale+active+clean, last acting [4]
> pg 5.3 is stuck stale for 8887559.656956, current state stale+active+clean, last acting [4]
> pg 2.4 is stuck stale for 8887559.656961, current state stale+active+clean, last acting [4]
> pg 6.0 is stuck stale for 8887559.656966, current state stale+active+clean, last acting [4]
> pg 3.6 is stuck stale for 8887559.656965, current state stale+active+clean, last acting [4]
> pg 3.7 is stuck stale for 8887559.656964, current state stale+active+clean, last acting [4]
> pg 2.6 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
> pg 0.3 is stuck stale for 8887559.656965, current state stale+active+clean, last acting [4]
> pg 5.6 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
> pg 7.4 is stuck stale for 8887559.656975, current state stale+active+clean, last acting [4]
> pg 3.1 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
> pg 6.4 is stuck stale for 8887559.656975, current state stale+active+clean, last acting [4]
> pg 5.4 is stuck stale for 8887559.656972, current state stale+active+clean, last acting [4]
> pg 2.3 is stuck stale for 8887559.656977, current state stale+active+clean, last acting [4]
> pg 5.5 is stuck stale for 8887559.656977, current state stale+active+clean, last acting [4]
> pg 3.3 is stuck stale for 8887559.656982, current state stale+active+clean, last acting [4]
> pg 5.7a is stuck stale for 8887559.657309, current state stale+active+clean, last acting [4]
> pg 6.78 is stuck stale for 8887559.657308, current state stale+active+clean, last acting [4]
> pg 5.78 is stuck stale for 8887559.657311, current state stale+active+clean, last acting [4]
> pg 5.79 is stuck stale for 8887559.657311, current state stale+active+clean, last acting [4]
> pg 6.7c is stuck stale for 8887559.657313, current state stale+active+clean, last acting [4]
> pg 7.7e is stuck stale for 8887559.657312, current state stale+active+clean, last acting [4]
> pg 6.7e is stuck stale for 8887559.657315, current state stale+active+clean, last acting [4]
> pg 7.70 is stuck stale for 8887559.657316, current state stale+active+clean, last acting [4]
> pg 6.73 is stuck stale for 8887559.657316, current state stale+active+clean, last acting [4]
> pg 5.77 is stuck stale for 8887559.657317, current state stale+active+clean, last acting [4]
> pg 5.74 is stuck stale for 8887559.657319, current state stale+active+clean, last acting [4]
> pg 5.75 is stuck stale for 8887559.657321, current state stale+active+clean, last acting [4]
> pg 7.68 is stuck stale for 8887559.657322, current state stale+active+clean, last acting [4]
> pg 6.68 is stuck stale for 8887559.657324, current state stale+active+clean, last acting [4]
> pg 7.6b is stuck stale for 8887559.657326, current state stale+active+clean, last acting [4]
> pg 6.6d is stuck stale for 8887559.657328, current state stale+active+clean, last acting [4]
> pg 5.6e is stuck stale for 8887559.657330, current state stale+active+clean, last acting [4]
> pg 6.6c is stuck stale for 8887559.657330, current state stale+active+clean, last acting [4]
> pg 7.6f is stuck stale for 8887559.657331, current state stale+active+clean, last acting [4]
> pg 7.60 is stuck stale for 8887559.657333, current state stale+active+clean, last acting [4]
> pg 6.60 is stuck stale for 8887559.657333, current state stale+active+clean, last acting [4]
> pg 7.62 is stuck stale for 8887559.657334, current state stale+active+clean, last acting [4]
> pg 6.65 is stuck stale for 8887559.657334, current state stale+active+clean, last acting [4]
> pg 7.64 is stuck stale for 8887559.657339, current state stale+active+clean, last acting [4]
> pg 5.67 is stuck stale for 8887559.657338, current state stale+active+clean, last acting [4]
> pg 7.66 is stuck stale for 8887559.657340, current state stale+active+clean, last acting [4]
> pg 6.66 is stuck stale for 8887559.657340, current state stale+active+clean, last acting [4]
> pg 7.67 is stuck stale for 8887559.657345, current state stale+active+clean, last acting [4]
> pg 6.59 is stuck stale for 8887559.657344, current state stale+active+clean, last acting [4]
> pg 7.58 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
> pg 6.58 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
> pg 7.59 is stuck stale for 8887559.657352, current state stale+active+clean, last acting [4]
> pg 6.5b is stuck stale for 8887559.657353, current state stale+active+clean, last acting [4]
> pg 5.59 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
> pg 6.5a is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
> pg 5.5e is stuck stale for 8887559.657352, current state stale+active+clean, last acting [4]
> pg 6.5d is stuck stale for 8887559.657358, current state stale+active+clean, last acting [4]
> pg 6.5f is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
> pg 7.51 is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
> pg 7.52 is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
> pg 7.53 is stuck stale for 8887559.657358, current state stale+active+clean, last acting [4]
> pg 6.55 is stuck stale for 8887559.657359, current state stale+active+clean, last acting [4]
> pg 7.54 is stuck stale for 8887559.657364, current state stale+active+clean, last acting [4]
> pg 6.54 is stuck stale for 8887559.657364, current state stale+active+clean, last acting [4]
> pg 6.57 is stuck stale for 8887559.657365, current state stale+active+clean, last acting [4]
> pg 7.56 is stuck stale for 8887559.657369, current state stale+active+clean, last acting [4]
> pg 5.55 is stuck stale for 8887559.657371, current state stale+active+clean, last acting [4]
> pg 7.48 is stuck stale for 8887559.657372, current state stale+active+clean, last acting [4]
> pg 6.49 is stuck stale for 8887559.657375, current state stale+active+clean, last acting [4]
> pg 5.4a is stuck stale for 8887559.657376, current state stale+active+clean, last acting [4]
> pg 6.48 is stuck stale for 8887559.657379, current state stale+active+clean, last acting [4]
> pg 7.4a is stuck stale for 8887559.657380, current state stale+active+clean, last acting [4]
> pg 6.4a is stuck stale for 8887559.657383, current state stale+active+clean, last acting [4]
> pg 6.4d is stuck stale for 8887559.657385, current state stale+active+clean, last acting [4]
> pg 7.4d is stuck stale for 8887559.657387, current state stale+active+clean, last acting [4]
> pg 6.4c is stuck stale for 8887559.657389, current state stale+active+clean, last acting [4]
> pg 6.4e is stuck stale for 8887559.657391, current state stale+active+clean, last acting [4]
> pg 5.42 is stuck stale for 8887559.657391, current state stale+active+clean, last acting [4]
> pg 6.43 is stuck stale for 8887559.657393, current state stale+active+clean, last acting [4]
> pg 5.41 is stuck stale for 8887559.657393, current state stale+active+clean, last acting [4]
> pg 5.47 is stuck stale for 8887559.657394, current state stale+active+clean, last acting [4]
> pg 7.46 is stuck stale for 8887559.657396, current state stale+active+clean, last acting [4]
> pg 6.39 is stuck stale for 8887559.657398, current state stale+active+clean, last acting [4]
> pg 5.3a is stuck stale for 8887559.657399, current state stale+active+clean, last acting [4]
> pg 2.3e is stuck stale for 8887559.657399, current state stale+active+clean, last acting [4]
> pg 0.3c is stuck stale for 8887559.657402, current state stale+active+clean, last acting [4]
> pg 7.3c is stuck stale for 8887559.657404, current state stale+active+clean, last acting [4]
> pg 7.3d is stuck stale for 8887559.657405, current state stale+active+clean, last acting [4]
> pg 0.39 is stuck stale for 8887559.657402, current state stale+active+clean, last acting [4]
> pg 5.3c is stuck stale for 8887559.657405, current state stale+active+clean, last acting [4]
> pg 2.3a is stuck stale for 8887559.657406, current state stale+active+clean, last acting [4]
> pg 0.38 is stuck stale for 8887559.657409, current state stale+active+clean, last acting [4]
> pg 2.35 is stuck stale for 8887559.657411, current state stale+active+clean, last acting [4]
> pg 0.37 is stuck stale for 8887559.657412, current state stale+active+clean, last acting [4]
> pg 5.32 is stuck stale for 8887559.657413, current state stale+active+clean, last acting [4]
> pg 2.34 is stuck stale for 8887559.657416, current state stale+active+clean, last acting [4]
> pg 0.36 is stuck stale for 8887559.657416, current state stale+active+clean, last acting [4]
> pg 7.32 is stuck stale for 8887559.657419, current state stale+active+clean, last acting [4]
> pg 6.33 is stuck stale for 8887559.657420, current state stale+active+clean, last acting [4]
> pg 0.35 is stuck stale for 8887559.657423, current state stale+active+clean, last acting [4]
> pg 6.35 is stuck stale for 8887559.657423, current state stale+active+clean, last acting [4]
> pg 5.36 is stuck stale for 8887559.657424, current state stale+active+clean, last acting [4]
> pg 2.30 is stuck stale for 8887559.657427, current state stale+active+clean, last acting [4]
> pg 5.37 is stuck stale for 8887559.657429, current state stale+active+clean, last acting [4]
> pg 7.36 is stuck stale for 8887559.657430, current state stale+active+clean, last acting [4]
> pg 6.37 is stuck stale for 8887559.657432, current state stale+active+clean, last acting [4]
> pg 6.28 is stuck stale for 8887559.657427, current state stale+active+clean, last acting [4]
>  
>  
> It stays that way, and I think the reason is what I found when I downloaded and decompiled the crush map:
> @storage1:/tmp# crushtool -d /tmp/crushmap
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
>  
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 device4
> device 5 osd.5
> device 6 osd.6
>  
>  
>  
> Is there a way to remove this device 4 (aka osd.4) from here, so that ceph can make another copy from the other location shown in “ceph pg map 2.3a”?
>  
> Regards.
>  
> Dimitar Boichev
> SysAdmin Team Lead
> AXSMarine Sofia
> Phone: +359 889 22 55 42
> Skype: dimitar.boichev.axsmarine
> E-mail: dimitar.boichev@xxxxxxxxxxxxx
>  
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
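
A note on the `device 4 device4` line in the decompiled map quoted above: this appears to be how crushtool prints a device ID slot that has no OSD assigned to it, rather than a leftover osd.4 entry, which would suggest the `ceph osd crush remove` itself went through. The following is a minimal sketch for listing such slots, assuming the `device <id> <name>` layout shown in the quoted map. `placeholder_devices` is a hypothetical helper name and it only parses text.

```shell
#!/bin/sh
# List device IDs whose name is the "device<N>" placeholder instead of "osd.<N>".
# Reads decompiled CRUSH map text (crushtool -d output) on stdin.
# (placeholder_devices is a hypothetical helper name used for illustration.)
placeholder_devices() {
    awk '$1 == "device" && $3 == ("device" $2) { print $2 }'
}

# Sample lines copied from the decompiled map quoted in this thread.
sample='# devices
device 0 osd.0
device 3 osd.3
device 4 device4
device 5 osd.5'

printf '%s\n' "$sample" | placeholder_devices
```

With a real map the input would presumably be `crushtool -d /tmp/crushmap | placeholder_devices`. If the only hit is ID 4, the stale PGs are more likely explained by their last acting set being [4] alone than by the CRUSH map.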




