Re: PG is in 'stuck unclean' state, but all acting OSD are up


 



Output of `ceph pg dump_stuck`

 

# ceph pg dump_stuck

ok

pg_stat state   up      up_primary      acting  acting_primary

4.2a8   down+peering    [79,8,74]       79      [79,8,74]       79

4.c3    down+peering    [56,79,67]      56      [56,79,67]      56

 

-Chris

 

From: Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx>
Date: Monday, August 15, 2016 at 9:03 PM
To: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>, "Heller, Chris" <cheller@xxxxxxxxxx>
Subject: Re: [ceph-users] PG is in 'stuck unclean' state, but all acting OSD are up

 

Hi Heller...

Can you post the output of `ceph pg dump_stuck`?

Cheers

G.

 

 

On 08/15/2016 10:19 PM, Heller, Chris wrote:

I’d like to better understand the current state of my Ceph cluster.

 

I currently have two PGs in the ‘stuck unclean’ state:

 

# ceph health detail

HEALTH_WARN 2 pgs down; 2 pgs peering; 2 pgs stuck inactive; 2 pgs stuck unclean

pg 4.2a8 is stuck inactive for 124516.777791, current state down+peering, last acting [79,8,74]

pg 4.c3 is stuck inactive since forever, current state down+peering, last acting [56,79,67]

pg 4.2a8 is stuck unclean for 124536.223284, current state down+peering, last acting [79,8,74]

pg 4.c3 is stuck unclean since forever, current state down+peering, last acting [56,79,67]

pg 4.2a8 is down+peering, acting [79,8,74]

pg 4.c3 is down+peering, acting [56,79,67]

 

While my cluster does currently have some down OSDs, none are in the acting set of either PG:

 

# ceph osd tree | grep down

73   1.00000         osd.73              down        0          1.00000

96   1.00000         osd.96              down        0          1.00000

110   1.00000         osd.110             down        0          1.00000

116   1.00000         osd.116             down        0          1.00000

120   1.00000         osd.120             down        0          1.00000

126   1.00000         osd.126             down        0          1.00000

124   1.00000         osd.124             down        0          1.00000

119   1.00000         osd.119             down        0          1.00000

 

I’ve queried one of the two PGs and see that recovery is currently blocked on osd.116, which is indeed down but is not part of that PG’s acting set:

 

http://pastebin.com/Rg2hK9GE
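As a quick sanity check on that claim, the acting sets from the `ceph health detail` output can be intersected with the down-OSD list from `ceph osd tree`. This small standalone sketch just uses the IDs pasted in this thread, so it needs no cluster access:

```shell
# Acting-set OSDs for pg 4.2a8 and 4.c3, copied from `ceph health detail` above.
acting="79 8 74 56 67"
# Down OSDs, copied from the `ceph osd tree | grep down` output above.
down="73 96 110 116 120 126 124 119"

# Collect any down OSD that also appears in an acting set.
overlap=""
for d in $down; do
  for a in $acting; do
    [ "$d" = "$a" ] && overlap="$overlap $d"
  done
done
echo "down OSDs appearing in an acting set:${overlap:- none}"
```

This prints `down OSDs appearing in an acting set: none`, which matches the pg query: osd.116 blocks recovery without being in either acting set.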

 

This is all with Ceph version 0.94.3:

 

# ceph version

ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)

 

Why do these PGs remain ‘stuck unclean’?

Are there steps I can take to unstick them, given that all the acting OSDs are up and in?
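Not a definitive answer, but on Hammer the usual escalation when peering is blocked by a down OSD outside the acting set is roughly the following. The block below is a dry run only: it assembles the candidate commands as text and prints them, touching nothing, because `ceph osd lost` abandons any data held only on that OSD and is strictly a last resort (the sysvinit-style start command is an assumption; adapt to your init system):

```shell
pg=4.2a8
blocked_osd=116  # the down OSD the pg query reports recovery is blocked on

# Candidate commands in escalating order; review each before running by hand.
cmds="ceph pg $pg query
service ceph start osd.$blocked_osd
ceph osd lost $blocked_osd --yes-i-really-mean-it"
echo "$cmds"
```

The first command re-checks the blocked peering state, the second tries the safe fix of bringing the blocking OSD back up, and the third tells the cluster to stop waiting for it.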

 

(* Re-sent, now that I’m subscribed to the list *)

-Chris




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW  2006
T: +61 2 93511937
