Re: pgs stuck inactive


 



Can you explain more about what happened?

The query shows progress is blocked by the following OSDs.

                "blocked_by": [
                    14,
                    17,
                    51,
                    58,
                    63,
                    64,
                    68,
                    70
                ],

Some of these OSDs are marked as "dne" (Does Not Exist).

"peer": "17",
"dne": 1,
"peer": "51",
"dne": 1,
"peer": "58",
"dne": 1,
"peer": "64",
"dne": 1,
"peer": "70",
"dne": 1,
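
To cross-check the two excerpts above without reading the whole query by hand, something like the following rough sketch might help. It assumes the output of "ceph pg 3.367 query" has been saved as JSON. The helper names (find_key, find_dne_peers, summarize) and the nesting of the sample data are made up for illustration; only the "blocked_by", "peer" and "dne" keys come from the output quoted here, so the code searches for them at any depth rather than assuming a fixed layout.

```python
import json


def find_key(node, key):
    """Yield every value stored under `key` anywhere in nested JSON data."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def osd_id(value):
    """Turn 17, "17" or "17(2)" (assumed shard suffix) into the integer 17."""
    return int(str(value).split("(")[0])


def find_dne_peers(node):
    """Yield the OSD id of every object with a "peer" key and a non-zero "dne"."""
    if isinstance(node, dict):
        if "peer" in node and node.get("dne"):
            yield osd_id(node["peer"])
        for v in node.values():
            yield from find_dne_peers(v)
    elif isinstance(node, list):
        for item in node:
            yield from find_dne_peers(item)


def summarize(query):
    """Return (blocked_by, dne peers, blocked_by OSDs without a dne entry)."""
    blocked = set()
    for value in find_key(query, "blocked_by"):
        if isinstance(value, list):
            blocked.update(osd_id(v) for v in value)
    dne = set(find_dne_peers(query))
    return sorted(blocked), sorted(dne), sorted(blocked - dne)


def load_query(path):
    """Load a saved pg query (hypothetical file name chosen by the caller)."""
    with open(path) as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Values copied from the excerpts in this mail; the nesting is hypothetical.
    sample = {
        "stats": {"blocked_by": [14, 17, 51, 58, 63, 64, 68, 70]},
        "peers": [{"peer": p, "dne": 1} for p in ("17", "51", "58", "64", "70")],
    }
    blocked, dne, rest = summarize(sample)
    print("blocked_by:         ", blocked)
    print("peers with dne set: ", dne)
    print("blocked, no dne set:", rest)
```

Going only by the excerpts quoted here, that would leave OSDs 14, 63 and 68 in "blocked_by" without a "dne" entry, but the full query output may list more peers than I pasted, so treat that as a starting point rather than a conclusion.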

Can we get a complete background here please?


On Thu, Mar 9, 2017 at 10:53 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
> Hello,
>
> After a major network outage our ceph cluster ended up with an inactive PG:
>
> # ceph health detail
> HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck unclean; 1
> requests are blocked > 32 sec; 1 osds have slow requests
> pg 3.367 is stuck inactive for 912263.766607, current state incomplete, last
> acting [28,35,2]
> pg 3.367 is stuck unclean for 912263.766688, current state incomplete, last
> acting [28,35,2]
> pg 3.367 is incomplete, acting [28,35,2]
> 1 ops are blocked > 268435 sec
> 1 ops are blocked > 268435 sec on osd.28
> 1 osds have slow requests
>
> # ceph -s
>     cluster 6713d1b8-83da-11e6-aa79-525400d98c5a
>      health HEALTH_WARN
>             1 pgs incomplete
>             1 pgs stuck inactive
>             1 pgs stuck unclean
>             1 requests are blocked > 32 sec
>      monmap e3: 3 mons at
> {tv-dl360-1=10.12.193.73:6789/0,tv-dl360-2=10.12.193.74:6789/0,tv-dl360-3=10.12.193.75:6789/0}
>             election epoch 72, quorum 0,1,2 tv-dl360-1,tv-dl360-2,tv-dl360-3
>      osdmap e60609: 72 osds: 72 up, 72 in
>       pgmap v3670252: 4864 pgs, 11 pools, 134 GB data, 23778 objects
>             490 GB used, 130 TB / 130 TB avail
>                 4863 active+clean
>                    1 incomplete
>   client io 0 B/s rd, 38465 B/s wr, 2 op/s
>
> ceph pg repair doesn't change anything. What should I try to recover it?
> Attached is the result of ceph pg query on the problem PG.
>
> Thank you,
> Laszlo
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad


