Hello,

After a major network outage our ceph cluster ended up with an inactive PG:

# ceph health detail
HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests
pg 3.367 is stuck inactive for 912263.766607, current state incomplete, last acting [28,35,2]
pg 3.367 is stuck unclean for 912263.766688, current state incomplete, last acting [28,35,2]
pg 3.367 is incomplete, acting [28,35,2]
1 ops are blocked > 268435 sec
1 ops are blocked > 268435 sec on osd.28
1 osds have slow requests

# ceph -s
    cluster 6713d1b8-83da-11e6-aa79-525400d98c5a
     health HEALTH_WARN
            1 pgs incomplete
            1 pgs stuck inactive
            1 pgs stuck unclean
            1 requests are blocked > 32 sec
     monmap e3: 3 mons at {tv-dl360-1=10.12.193.73:6789/0,tv-dl360-2=10.12.193.74:6789/0,tv-dl360-3=10.12.193.75:6789/0}
            election epoch 72, quorum 0,1,2 tv-dl360-1,tv-dl360-2,tv-dl360-3
     osdmap e60609: 72 osds: 72 up, 72 in
      pgmap v3670252: 4864 pgs, 11 pools, 134 GB data, 23778 objects
            490 GB used, 130 TB / 130 TB avail
                4863 active+clean
                   1 incomplete
  client io 0 B/s rd, 38465 B/s wr, 2 op/s

ceph pg repair doesn't change anything. What should I try to recover it?
Attached is the result of ceph pg query on the problem PG.

Thank you,
Laszlo
Attachment: pg_3.367_query.gz (application/gzip)
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com