Incomplete pg blocking reads

Hi everyone,

I've run into a problem with my cluster - 1 pg is incomplete, and that is blocking reads for a 100TiB RBD volume. (The VM actually halts execution, it's not like a normal I/O problem where the virtual controller times out and tries to reset the bus, etc.)

I've read several threads about the problem with incomplete pgs and have dug around quite a bit, but I suspect I don't quite know where to look for the information I need.

The pool holding this volume has a size of 2, with min_size of 2. I suspect the problem began when two osds, in separate hosts, failed within a short time of each other. (osd.2 and osd.21 in this case.) The physical disk for osd.2 is dead, but 21's disk seems okay and the XFS filesystem behind it doesn't have any problems that xfs_repair can find.

In an attempt to resolve the issue, I've restarted the individual osds, restarted the entire cluster, rebooted all the cluster hosts, and upgraded to the latest devel build to rule out a known, already-fixed bug. Most recently, I tried marking osd 8 'out', since I believe 5 has the data that is missing. I can provide logfiles from osds 5 and 8 with increased debugging if that'll help.

I can restart osd.21, and that makes the "incomplete" pg go away for a few minutes, but osd.21 crashes within 30 seconds of being started, and the incomplete pg comes back when that happens.

Any suggestions on how to troubleshoot this further would be helpful. I had hoped to chronicle my attempts to date to avoid duplication of effort, but I have tried too many things to reconstruct the path clearly.

Thanks in advance!


Here are some stats that might help:

aaron@seven ~ $ ceph -v
ceph version 0.76 (3b990136bfab74249f166dd742fd8e61637e63d9)


aaron@seven ~ $ ceph pg stat
v7286520: 2200 pgs: 2048 active+clean, 78 active+remapped+wait_backfill, 72 active+remapped+backfilling, 1 incomplete, 1 active+clean+inconsistent; 21577 GB data, 41396 GB used, 27946 GB / 69343 GB avail; 461896/11663686 objects degraded (3.960%); 210 MB/s, 53 objects/s recovering


aaron@seven ~ $ ceph health detail
HEALTH_ERR 78 pgs backfill; 72 pgs backfilling; 1 pgs incomplete; 1 pgs inconsistent; 1 pgs stuck inactive; 151 pgs stuck unclean; recovery 459868/11663686 objects degraded (3.943%); 1 scrub errors; mds picard is laggy
pg 2.28b is stuck inactive since forever, current state incomplete, last acting [6,5]
pg 2.6b is stuck unclean for 1161.425227, current state active+remapped+backfilling, last acting [19,8,4]
pg 3.1e4 is stuck unclean for 888.661689, current state active+remapped+wait_backfill, last acting [8,20,22]
pg 2.1e5 is stuck unclean for 888.661578, current state active+remapped+wait_backfill, last acting [8,20,22]
pg 2.127 is stuck unclean for 889.325675, current state active+remapped+backfilling, last acting [5,8,11]
pg 2.63 is stuck unclean for 888.661689, current state active+remapped+wait_backfill, last acting [8,16,3]
pg 3.62 is stuck unclean for 888.661682, current state active+remapped+wait_backfill, last acting [8,16,3]
pg 2.1e0 is stuck unclean for 888.661675, current state active+remapped+wait_backfill, last acting [8,19,1]
pg 2.420 is stuck unclean for 888.662537, current state active+remapped+wait_backfill, last acting [8,17,1]
pg 3.1df is stuck unclean for 888.661671, current state active+remapped+wait_backfill, last acting [8,19,1]
pg 2.118 is stuck unclean for 888.662593, current state active+remapped+wait_backfill, last acting [8,19,3]
pg 2.29e is stuck unclean for 219581.715114, current state active+remapped+backfilling, last acting [14,8,7]
pg 3.114 is stuck unclean for 888.662518, current state active+remapped+wait_backfill, last acting [8,14,3]
pg 2.115 is stuck unclean for 888.662537, current state active+remapped+backfilling, last acting [8,14,3]
pg 2.57 is stuck unclean for 159935.373946, current state active+remapped+backfilling, last acting [6,8,0]
pg 3.56 is stuck unclean for 1643032.146070, current state active+remapped+wait_backfill, last acting [6,8,0]
pg 2.1d4 is stuck unclean for 888.661671, current state active+remapped+backfilling, last acting [8,13,3]
pg 3.117 is stuck unclean for 888.662503, current state active+remapped+wait_backfill, last acting [8,19,3]
pg 2.294 is stuck unclean for 888.662490, current state active+remapped+wait_backfill, last acting [8,17,20]
pg 2.41a is stuck unclean for 1161.455746, current state active+remapped+wait_backfill, last acting [20,8,0]
pg 2.1d0 is stuck unclean for 888.661674, current state active+remapped+wait_backfill, last acting [8,16,14]
pg 3.351 is stuck unclean for 888.661636, current state active+remapped+wait_backfill, last acting [8,11,17]
pg 3.293 is stuck unclean for 888.662448, current state active+remapped+wait_backfill, last acting [8,17,20]
pg 2.352 is stuck unclean for 888.661623, current state active+remapped+wait_backfill, last acting [8,11,17]
pg 3.1cf is stuck unclean for 888.661638, current state active+remapped+wait_backfill, last acting [8,16,14]
pg 2.40d is stuck unclean for 889.663590, current state active+remapped+backfilling, last acting [16,8,3]
pg 2.34f is stuck unclean for 888.661581, current state active+remapped+wait_backfill, last acting [8,18,9]
pg 3.34e is stuck unclean for 888.661644, current state active+remapped+wait_backfill, last acting [8,18,9]
pg 2.4a is stuck unclean for 146084.212967, current state active+remapped+backfilling, last acting [16,8,4]
pg 2.1cb is stuck unclean for 81410.959695, current state active+remapped+backfilling, last acting [19,8,9]
pg 2.1ca is stuck unclean for 160231.487694, current state active+remapped+backfilling, last acting [15,8,7]
pg 2.28b is stuck unclean since forever, current state incomplete, last acting [6,5]
pg 2.409 is stuck unclean for 160095.566719, current state active+remapped+backfilling, last acting [4,8,1]
pg 2.345 is stuck unclean for 219637.448262, current state active+remapped+backfilling, last acting [6,8,15]
pg 2.405 is stuck unclean for 160095.566704, current state active+remapped+backfilling, last acting [14,8,6]
pg 2.fc is stuck unclean for 888.662298, current state active+remapped+wait_backfill, last acting [8,20,3]
pg 2.342 is stuck unclean for 219394.652780, current state active+remapped+backfilling, last acting [6,8,22]
pg 2.1bf is stuck unclean for 889.325454, current state active+remapped+backfilling, last acting [11,8,7]
pg 2.1b8 is stuck unclean for 889.326218, current state active+remapped+backfilling, last acting [15,8,5]
pg 3.fb is stuck unclean for 888.662289, current state active+remapped+wait_backfill, last acting [8,20,3]
pg 2.338 is stuck unclean for 1161.424960, current state active+remapped+backfilling, last acting [19,8,13]
pg 2.3f9 is stuck unclean for 889.326078, current state active+remapped+backfilling, last acting [14,8,10]
pg 3.1b6 is stuck unclean for 888.661612, current state active+remapped+wait_backfill, last acting [8,17,0]
pg 2.1b7 is stuck unclean for 888.661625, current state active+remapped+wait_backfill, last acting [8,17,0]
pg 2.3f7 is stuck unclean for 888.662259, current state active+remapped+backfilling, last acting [8,11,0]
pg 2.270 is stuck unclean for 889.669800, current state active+remapped+backfilling, last acting [6,8,7]
pg 2.2c is stuck unclean for 888.661599, current state active+remapped+wait_backfill, last acting [8,15,9]
pg 3.2d is stuck unclean for 888.661593, current state active+remapped+wait_backfill, last acting [8,15,3]
pg 2.273 is stuck unclean for 1161.773478, current state active+remapped+backfilling, last acting [18,8,1]
pg 2.3f0 is stuck unclean for 85410.265493, current state active+remapped+backfilling, last acting [5,8,10]
pg 2.2e is stuck unclean for 888.661595, current state active+remapped+wait_backfill, last acting [8,15,3]
pg 3.2b is stuck unclean for 888.661580, current state active+remapped+wait_backfill, last acting [8,15,9]
pg 3.1a9 is stuck unclean for 888.661574, current state active+remapped+wait_backfill, last acting [8,18,7]
pg 2.3ef is stuck unclean for 1161.426416, current state active+remapped+backfilling, last acting [19,8,7]
pg 2.1aa is stuck unclean for 888.661661, current state active+remapped+wait_backfill, last acting [8,18,7]
pg 2.32b is stuck unclean for 156663.886748, current state active+remapped+backfilling, last acting [14,8,17]
pg 2.21 is stuck unclean for 1161.769888, current state active+remapped+backfilling, last acting [18,8,13]
pg 3.1c is stuck unclean for 888.661636, current state active+remapped+wait_backfill, last acting [8,16,0]
pg 2.1d is stuck unclean for 888.661648, current state active+remapped+wait_backfill, last acting [8,16,0]
pg 2.1c is stuck unclean for 888.661632, current state active+remapped+wait_backfill, last acting [8,20,7]
pg 2.323 is stuck unclean for 302501.945292, current state active+remapped+backfilling, last acting [13,8,10]
pg 2.4a0 is stuck unclean for 888.661565, current state active+remapped+wait_backfill, last acting [8,16,11]
pg 3.19 is stuck unclean for 888.661636, current state active+remapped+wait_backfill, last acting [8,13,0]
pg 3.1a is stuck unclean for 888.661619, current state active+remapped+wait_backfill, last acting [8,10,5]
pg 2.1b is stuck unclean for 888.661633, current state active+remapped+wait_backfill, last acting [8,10,5]
pg 2.25e is stuck unclean for 888.662278, current state active+remapped+backfilling, last acting [8,14,22]
pg 3.1b is stuck unclean for 888.661612, current state active+remapped+wait_backfill, last acting [8,20,7]
pg 2.1a is stuck unclean for 888.661626, current state active+remapped+wait_backfill, last acting [8,13,0]
pg 2.3d9 is stuck unclean for 1161.773372, current state active+remapped+backfilling, last acting [18,8,16]
pg 3.12 is stuck unclean for 888.661578, current state active+remapped+wait_backfill, last acting [8,17,19]
pg 2.13 is stuck unclean for 888.661591, current state active+remapped+wait_backfill, last acting [8,17,19]
pg 2.317 is stuck unclean for 159979.814667, current state active+remapped+backfilling, last acting [4,8,1]
pg 2.3d7 is stuck unclean for 889.327661, current state active+remapped+backfilling, last acting [11,8,0]
pg 2.30c is stuck unclean for 888.663342, current state active+remapped+backfilling, last acting [8,15,0]
pg 2.492 is stuck unclean for 889.664908, current state active+remapped+backfilling, last acting [16,8,22]
pg 2.b is stuck unclean for 888.663383, current state active+remapped+backfilling, last acting [8,11,22]
pg 2.30f is stuck unclean for 889.327852, current state active+remapped+backfilling, last acting [4,8,9]
pg 3.3cd is stuck unclean for 888.662282, current state active+remapped+wait_backfill, last acting [8,11,14]
pg 2.3ce is stuck unclean for 888.662337, current state active+remapped+wait_backfill, last acting [8,11,14]
pg 2.3c9 is stuck unclean for 888.662309, current state active+remapped+wait_backfill, last acting [8,19,0]
pg 3.3c8 is stuck unclean for 888.662329, current state active+remapped+wait_backfill, last acting [8,19,0]
pg 2.184 is stuck unclean for 209128.087843, current state active+remapped+backfilling, last acting [13,8,9]
pg 2.3cb is stuck unclean for 888.662278, current state active+remapped+wait_backfill, last acting [8,19,22]
pg 3.3ca is stuck unclean for 888.662310, current state active+remapped+wait_backfill, last acting [8,19,22]
pg 2.3ca is stuck unclean for 1161.455302, current state active+remapped+backfilling, last acting [20,8,15]
pg 2.c1 is stuck unclean for 888.662429, current state active+remapped+wait_backfill, last acting [8,15,13]
pg 3.c0 is stuck unclean for 888.662398, current state active+remapped+wait_backfill, last acting [8,15,13]
pg 2.2 is stuck unclean for 232941.977076, current state active+remapped+backfilling, last acting [6,8,3]
pg 2.183 is stuck unclean for 160014.724114, current state active+remapped+backfilling, last acting [5,8,1]
pg 3.241 is stuck unclean for 888.662292, current state active+remapped+wait_backfill, last acting [8,18,9]
pg 3.301 is stuck unclean for 888.663258, current state active+remapped+wait_backfill, last acting [8,19,9]
pg 2.242 is stuck unclean for 888.662284, current state active+remapped+wait_backfill, last acting [8,18,9]
pg 2.302 is stuck unclean for 888.663248, current state active+remapped+wait_backfill, last acting [8,19,9]
pg 2.23c is stuck unclean for 889.325420, current state active+remapped+backfilling, last acting [14,8,0]
pg 2.ba is stuck unclean for 889.327554, current state active+remapped+backfilling, last acting [12,8,1]
pg 2.175 is stuck unclean for 219770.886373, current state active+remapped+backfilling, last acting [4,8,9]
pg 2.3b8 is stuck unclean for 888.662229, current state active+remapped+wait_backfill, last acting [8,14,22]
pg 2.174 is stuck unclean for 888.663214, current state active+remapped+wait_backfill, last acting [8,15,16]
pg 2.478 is stuck unclean for 1161.774401, current state active+remapped+backfilling, last acting [18,8,1]
pg 2.2f5 is stuck unclean for 888.663146, current state active+remapped+backfilling, last acting [8,12,7]
pg 2.176 is stuck unclean for 1161.427407, current state active+remapped+backfilling, last acting [19,8,1]
pg 3.3b7 is stuck unclean for 888.662341, current state active+remapped+wait_backfill, last acting [8,14,22]
pg 3.173 is stuck unclean for 888.663097, current state active+remapped+wait_backfill, last acting [8,15,16]
pg 3.232 is stuck unclean for 888.662195, current state active+remapped+wait_backfill, last acting [8,20,3]
pg 2.233 is stuck unclean for 888.662216, current state active+remapped+wait_backfill, last acting [8,20,3]
pg 2.af is stuck unclean for 85410.264417, current state active+remapped+backfilling, last acting [6,8,3]
pg 3.168 is stuck unclean for 888.663251, current state active+remapped+wait_backfill, last acting [8,20,0]
pg 2.169 is stuck unclean for 888.663291, current state active+remapped+wait_backfill, last acting [8,20,0]
pg 2.ab is stuck unclean for 888.662356, current state active+remapped+wait_backfill, last acting [8,18,17]
pg 3.aa is stuck unclean for 888.662351, current state active+remapped+wait_backfill, last acting [8,18,17]
pg 3.228 is stuck unclean for 888.662337, current state active+remapped+wait_backfill, last acting [8,14,0]
pg 2.229 is stuck unclean for 888.662350, current state active+remapped+wait_backfill, last acting [8,14,0]
pg 2.22b is stuck unclean for 1161.770522, current state active+remapped+backfilling, last acting [18,8,0]
pg 2.161 is stuck unclean for 1161.839067, current state active+remapped+backfilling, last acting [17,8,7]
pg 2.3a4 is stuck unclean for 160095.566258, current state active+remapped+backfilling, last acting [18,8,0]
pg 2.3a7 is stuck unclean for 888.662256, current state active+remapped+wait_backfill, last acting [8,18,22]
pg 3.3a6 is stuck unclean for 888.662272, current state active+remapped+wait_backfill, last acting [8,18,22]
pg 2.a2 is stuck unclean for 889.669714, current state active+remapped+backfilling, last acting [6,8,7]
pg 2.467 is stuck unclean for 889.327384, current state active+remapped+backfilling, last acting [13,8,3]
pg 2.3a1 is stuck unclean for 159994.542551, current state active+remapped+backfilling, last acting [19,8,5]
pg 3.3a3 is stuck unclean for 1643146.397267, current state active+remapped+backfilling, last acting [18,8,0]
pg 2.158 is stuck unclean for 889.331607, current state active+remapped+backfilling, last acting [15,8,9]
pg 2.218 is stuck unclean for 1161.455144, current state active+remapped+backfilling, last acting [20,8,16]
pg 2.95 is stuck unclean for 888.662241, current state active+remapped+wait_backfill, last acting [8,13,0]
pg 3.94 is stuck unclean for 888.662236, current state active+remapped+wait_backfill, last acting [8,13,0]
pg 2.399 is stuck unclean for 1161.424873, current state active+remapped+backfilling, last acting [19,8,11]
pg 2.2da is stuck unclean for 889.325694, current state active+remapped+backfilling, last acting [10,8,22]
pg 2.458 is stuck unclean for 889.664588, current state active+remapped+backfilling, last acting [16,8,19]
pg 2.455 is stuck unclean for 888.663123, current state active+remapped+wait_backfill, last acting [8,18,17]
pg 2.397 is stuck unclean for 888.662229, current state active+remapped+wait_backfill, last acting [8,17,1]
pg 3.396 is stuck unclean for 888.662244, current state active+remapped+wait_backfill, last acting [8,17,1]
pg 2.152 is stuck unclean for 889.664670, current state active+remapped+backfilling, last acting [16,8,3]
pg 2.20c is stuck unclean for 889.325056, current state active+remapped+backfilling, last acting [14,8,20]
pg 2.44d is stuck unclean for 888.663004, current state active+remapped+wait_backfill, last acting [8,11,13]
pg 2.2ce is stuck unclean for 1161.838375, current state active+remapped+backfilling, last acting [17,8,13]
pg 3.20a is stuck unclean for 1161.425993, current state active+remapped+wait_backfill, last acting [19,8,0]
pg 2.20b is stuck unclean for 1161.424832, current state active+remapped+wait_backfill, last acting [19,8,0]
pg 2.2c8 is stuck unclean for 1161.427151, current state active+remapped+backfilling, last acting [19,8,9]
pg 2.20a is stuck unclean for 889.326613, current state active+remapped+backfilling, last acting [4,8,6]
pg 2.87 is stuck unclean for 1161.837786, current state active+remapped+backfilling, last acting [17,8,15]
pg 2.384 is stuck unclean for 889.324992, current state active+remapped+backfilling, last acting [14,8,5]
pg 2.2c0 is stuck unclean for 888.662914, current state active+remapped+wait_backfill, last acting [8,11,18]
pg 2.446 is stuck unclean for 1161.774298, current state active+remapped+backfilling, last acting [18,8,0]
pg 3.13c is stuck unclean for 888.662984, current state active+remapped+wait_backfill, last acting [8,16,14]
pg 2.13d is stuck unclean for 888.663018, current state active+remapped+backfilling, last acting [8,16,14]
pg 2.1fd is stuck unclean for 888.662197, current state active+remapped+backfilling, last acting [8,12,22]
pg 2.440 is stuck unclean for 195955.637808, current state active+remapped+backfilling, last acting [17,8,11]
pg 3.2bf is stuck unclean for 888.662780, current state active+remapped+wait_backfill, last acting [8,11,18]
pg 2.1f8 is stuck unclean for 889.326608, current state active+remapped+backfilling, last acting [11,8,22]
pg 2.135 is stuck unclean for 888.662853, current state active+remapped+backfilling, last acting [8,11,0]
pg 2.2b5 is stuck unclean for 889.325574, current state active+remapped+backfilling, last acting [10,8,7]
pg 2.436 is stuck unclean for 160015.918251, current state active+remapped+backfilling, last acting [17,8,3]
pg 2.4a0 is active+remapped+wait_backfill, acting [8,16,11]
pg 2.492 is active+remapped+backfilling, acting [16,8,22]
pg 2.478 is active+remapped+backfilling, acting [18,8,1]
pg 2.467 is active+remapped+backfilling, acting [13,8,3]
pg 2.458 is active+remapped+backfilling, acting [16,8,19]
pg 2.455 is active+remapped+wait_backfill, acting [8,18,17]
pg 2.44d is active+remapped+wait_backfill, acting [8,11,13]
pg 2.446 is active+remapped+backfilling, acting [18,8,0]
pg 2.440 is active+remapped+backfilling, acting [17,8,11]
pg 2.436 is active+remapped+backfilling, acting [17,8,3]
pg 2.420 is active+remapped+wait_backfill, acting [8,17,1]
pg 2.41a is active+remapped+wait_backfill, acting [20,8,0]
pg 2.40d is active+remapped+backfilling, acting [16,8,3]
pg 2.409 is active+remapped+backfilling, acting [4,8,1]
pg 2.405 is active+remapped+backfilling, acting [14,8,6]
pg 2.3f9 is active+remapped+backfilling, acting [14,8,10]
pg 2.3f7 is active+remapped+backfilling, acting [8,11,0]
pg 2.3f0 is active+remapped+backfilling, acting [5,8,10]
pg 2.3ef is active+remapped+backfilling, acting [19,8,7]
pg 2.3d9 is active+remapped+backfilling, acting [18,8,16]
pg 2.3d7 is active+remapped+backfilling, acting [11,8,0]
pg 3.3cd is active+remapped+wait_backfill, acting [8,11,14]
pg 2.3ce is active+remapped+wait_backfill, acting [8,11,14]
pg 3.3c8 is active+remapped+wait_backfill, acting [8,19,0]
pg 2.3c9 is active+remapped+wait_backfill, acting [8,19,0]
pg 3.3ca is active+remapped+wait_backfill, acting [8,19,22]
pg 2.3cb is active+remapped+wait_backfill, acting [8,19,22]
pg 2.3ca is active+remapped+backfilling, acting [20,8,15]
pg 2.3b8 is active+remapped+wait_backfill, acting [8,14,22]
pg 3.3b7 is active+remapped+wait_backfill, acting [8,14,22]
pg 2.3a4 is active+remapped+backfilling, acting [18,8,0]
pg 3.3a6 is active+remapped+wait_backfill, acting [8,18,22]
pg 2.3a7 is active+remapped+wait_backfill, acting [8,18,22]
pg 2.3a1 is active+remapped+backfilling, acting [19,8,5]
pg 3.3a3 is active+remapped+backfilling, acting [18,8,0]
pg 2.399 is active+remapped+backfilling, acting [19,8,11]
pg 3.396 is active+remapped+wait_backfill, acting [8,17,1]
pg 2.397 is active+remapped+wait_backfill, acting [8,17,1]
pg 2.384 is active+remapped+backfilling, acting [14,8,5]
pg 3.351 is active+remapped+wait_backfill, acting [8,11,17]
pg 2.352 is active+remapped+wait_backfill, acting [8,11,17]
pg 3.34e is active+remapped+wait_backfill, acting [8,18,9]
pg 2.34f is active+remapped+wait_backfill, acting [8,18,9]
pg 2.345 is active+remapped+backfilling, acting [6,8,15]
pg 2.342 is active+remapped+backfilling, acting [6,8,22]
pg 2.338 is active+remapped+backfilling, acting [19,8,13]
pg 2.32b is active+remapped+backfilling, acting [14,8,17]
pg 2.323 is active+remapped+backfilling, acting [13,8,10]
pg 2.317 is active+remapped+backfilling, acting [4,8,1]
pg 2.30c is active+remapped+backfilling, acting [8,15,0]
pg 2.30f is active+remapped+backfilling, acting [4,8,9]
pg 3.301 is active+remapped+wait_backfill, acting [8,19,9]
pg 2.302 is active+remapped+wait_backfill, acting [8,19,9]
pg 2.2f5 is active+remapped+backfilling, acting [8,12,7]
pg 2.2da is active+remapped+backfilling, acting [10,8,22]
pg 2.2ce is active+remapped+backfilling, acting [17,8,13]
pg 2.2c8 is active+remapped+backfilling, acting [19,8,9]
pg 2.2c0 is active+remapped+wait_backfill, acting [8,11,18]
pg 3.2bf is active+remapped+wait_backfill, acting [8,11,18]
pg 2.2b5 is active+remapped+backfilling, acting [10,8,7]
pg 2.29e is active+remapped+backfilling, acting [14,8,7]
pg 2.294 is active+remapped+wait_backfill, acting [8,17,20]
pg 3.293 is active+remapped+wait_backfill, acting [8,17,20]
pg 2.28b is incomplete, acting [6,5]
pg 2.270 is active+remapped+backfilling, acting [6,8,7]
pg 2.273 is active+remapped+backfilling, acting [18,8,1]
pg 2.25e is active+remapped+backfilling, acting [8,14,22]
pg 3.241 is active+remapped+wait_backfill, acting [8,18,9]
pg 2.242 is active+remapped+wait_backfill, acting [8,18,9]
pg 2.23c is active+remapped+backfilling, acting [14,8,0]
pg 2.233 is active+remapped+wait_backfill, acting [8,20,3]
pg 3.232 is active+remapped+wait_backfill, acting [8,20,3]
pg 2.229 is active+remapped+wait_backfill, acting [8,14,0]
pg 3.228 is active+remapped+wait_backfill, acting [8,14,0]
pg 2.22b is active+remapped+backfilling, acting [18,8,0]
pg 2.218 is active+remapped+backfilling, acting [20,8,16]
pg 2.20c is active+remapped+backfilling, acting [14,8,20]
pg 2.20b is active+remapped+wait_backfill, acting [19,8,0]
pg 3.20a is active+remapped+wait_backfill, acting [19,8,0]
pg 2.20a is active+remapped+backfilling, acting [4,8,6]
pg 2.1fd is active+remapped+backfilling, acting [8,12,22]
pg 2.1f8 is active+remapped+backfilling, acting [11,8,22]
pg 2.1e5 is active+remapped+wait_backfill, acting [8,20,22]
pg 3.1e4 is active+remapped+wait_backfill, acting [8,20,22]
pg 2.1e0 is active+remapped+wait_backfill, acting [8,19,1]
pg 3.1df is active+remapped+wait_backfill, acting [8,19,1]
pg 2.1d4 is active+remapped+backfilling, acting [8,13,3]
pg 2.1d0 is active+remapped+wait_backfill, acting [8,16,14]
pg 3.1cf is active+remapped+wait_backfill, acting [8,16,14]
pg 2.1cb is active+remapped+backfilling, acting [19,8,9]
pg 2.1ca is active+remapped+backfilling, acting [15,8,7]
pg 2.1bf is active+remapped+backfilling, acting [11,8,7]
pg 2.1b8 is active+remapped+backfilling, acting [15,8,5]
pg 2.1b7 is active+remapped+wait_backfill, acting [8,17,0]
pg 3.1b6 is active+remapped+wait_backfill, acting [8,17,0]
pg 3.1a9 is active+remapped+wait_backfill, acting [8,18,7]
pg 2.1aa is active+remapped+wait_backfill, acting [8,18,7]
pg 2.184 is active+remapped+backfilling, acting [13,8,9]
pg 2.183 is active+remapped+backfilling, acting [5,8,1]
pg 2.175 is active+remapped+backfilling, acting [4,8,9]
pg 2.174 is active+remapped+wait_backfill, acting [8,15,16]
pg 2.176 is active+remapped+backfilling, acting [19,8,1]
pg 3.173 is active+remapped+wait_backfill, acting [8,15,16]
pg 2.169 is active+remapped+wait_backfill, acting [8,20,0]
pg 3.168 is active+remapped+wait_backfill, acting [8,20,0]
pg 2.161 is active+remapped+backfilling, acting [17,8,7]
pg 2.158 is active+remapped+backfilling, acting [15,8,9]
pg 2.152 is active+remapped+backfilling, acting [16,8,3]
pg 2.13d is active+remapped+backfilling, acting [8,16,14]
pg 3.13c is active+remapped+wait_backfill, acting [8,16,14]
pg 2.135 is active+remapped+backfilling, acting [8,11,0]
pg 2.127 is active+remapped+backfilling, acting [5,8,11]
pg 2.118 is active+remapped+wait_backfill, acting [8,19,3]
pg 2.115 is active+remapped+backfilling, acting [8,14,3]
pg 3.114 is active+remapped+wait_backfill, acting [8,14,3]
pg 3.117 is active+remapped+wait_backfill, acting [8,19,3]
pg 2.fc is active+remapped+wait_backfill, acting [8,20,3]
pg 3.fb is active+remapped+wait_backfill, acting [8,20,3]
pg 2.cc is active+clean+inconsistent, acting [20,6]
pg 3.c0 is active+remapped+wait_backfill, acting [8,15,13]
pg 2.c1 is active+remapped+wait_backfill, acting [8,15,13]
pg 2.ba is active+remapped+backfilling, acting [12,8,1]
pg 2.af is active+remapped+backfilling, acting [6,8,3]
pg 3.aa is active+remapped+wait_backfill, acting [8,18,17]
pg 2.ab is active+remapped+wait_backfill, acting [8,18,17]
pg 2.a2 is active+remapped+backfilling, acting [6,8,7]
pg 3.94 is active+remapped+wait_backfill, acting [8,13,0]
pg 2.95 is active+remapped+wait_backfill, acting [8,13,0]
pg 2.87 is active+remapped+backfilling, acting [17,8,15]
pg 2.6b is active+remapped+backfilling, acting [19,8,4]
pg 3.62 is active+remapped+wait_backfill, acting [8,16,3]
pg 2.63 is active+remapped+wait_backfill, acting [8,16,3]
pg 3.56 is active+remapped+wait_backfill, acting [6,8,0]
pg 2.57 is active+remapped+backfilling, acting [6,8,0]
pg 2.4a is active+remapped+backfilling, acting [16,8,4]
pg 3.2d is active+remapped+wait_backfill, acting [8,15,3]
pg 2.2c is active+remapped+wait_backfill, acting [8,15,9]
pg 2.2e is active+remapped+wait_backfill, acting [8,15,3]
pg 3.2b is active+remapped+wait_backfill, acting [8,15,9]
pg 2.21 is active+remapped+backfilling, acting [18,8,13]
pg 2.1d is active+remapped+wait_backfill, acting [8,16,0]
pg 3.1c is active+remapped+wait_backfill, acting [8,16,0]
pg 2.1c is active+remapped+wait_backfill, acting [8,20,7]
pg 3.19 is active+remapped+wait_backfill, acting [8,13,0]
pg 2.1b is active+remapped+wait_backfill, acting [8,10,5]
pg 3.1a is active+remapped+wait_backfill, acting [8,10,5]
pg 2.1a is active+remapped+wait_backfill, acting [8,13,0]
pg 3.1b is active+remapped+wait_backfill, acting [8,20,7]
pg 2.13 is active+remapped+wait_backfill, acting [8,17,19]
pg 3.12 is active+remapped+wait_backfill, acting [8,17,19]
pg 2.b is active+remapped+backfilling, acting [8,11,22]
pg 2.2 is active+remapped+backfilling, acting [6,8,3]
recovery 459868/11663686 objects degraded (3.943%)
1 scrub errors
mds.picard at 10.42.6.21:6800/13626 is laggy/unresponsive
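
Since the listing above is long, here is a minimal sketch (not part of the original output) of how it could be condensed. It assumes the per-pg lines keep the "pg <id> is <state>, acting [..]" shape shown above. The names PG_LINE and summarize are made up for illustration. It counts pgs per state and how often each osd appears in an acting set, which makes it easier to see that osd 8 shows up in most of the remapped pgs.

```python
import re
from collections import Counter

# Matches the short per-pg lines, for example:
#   pg 2.28b is incomplete, acting [6,5]
# The longer "is stuck ..." lines do not match and are skipped.
PG_LINE = re.compile(r"^pg (\S+) is ([a-z_+]+), acting \[([\d,]+)\]$")


def summarize(lines):
    """Return (state -> pg count, osd id -> number of acting sets it is in)."""
    states = Counter()
    osds = Counter()
    for line in lines:
        match = PG_LINE.match(line.strip())
        if match is None:
            continue  # summary lines and "stuck" lines are ignored
        _pgid, state, acting = match.groups()
        states[state] += 1
        osds.update(int(osd) for osd in acting.split(","))
    return states, osds


# A few lines copied from the output above, used as sample input.
sample = [
    "pg 2.28b is stuck inactive since forever, current state incomplete, last acting [6,5]",
    "pg 2.4a0 is active+remapped+wait_backfill, acting [8,16,11]",
    "pg 2.492 is active+remapped+backfilling, acting [16,8,22]",
    "pg 2.28b is incomplete, acting [6,5]",
    "pg 2.cc is active+clean+inconsistent, acting [20,6]",
]

states, osds = summarize(sample)
print(states.most_common())
print(osds.most_common(3))
```

To run it on the full listing, the saved output of "ceph health detail" could be read from a file and passed to summarize() line by line.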

aaron@seven ~ $ ceph pg 2.28b query
{ "state": "incomplete",
  "epoch": 36361,
  "up": [
        6,
        5],
  "acting": [
        6,
        5],
  "info": { "pgid": "2.28b",
      "last_update": "35256'44286",
      "last_complete": "35256'44286",
      "log_tail": "34732'41286",
      "last_user_version": 0,
      "last_backfill": "84ed7a8b\/rbd_data.a623c2ae8944a.0000000000052a3a\/head\/\/2",
      "purged_snaps": "[]",
      "history": { "epoch_created": 1,
          "last_epoch_started": 36252,
          "last_epoch_clean": 34760,
          "last_epoch_split": 0,
          "same_up_since": 35405,
          "same_interval_since": 36276,
          "same_primary_since": 36274,
          "last_scrub": "34757'44284",
          "last_scrub_stamp": "2014-02-08 11:33:51.835956",
          "last_deep_scrub": "34757'44284",
          "last_deep_scrub_stamp": "2014-02-08 11:33:45.299503",
          "last_clean_scrub_stamp": "2014-02-08 11:33:51.835956"},
      "stats": { "version": "35256'44286",
          "reported_seq": "727",
          "reported_epoch": "36361",
          "state": "incomplete",
          "last_fresh": "2014-02-10 19:35:37.361600",
          "last_change": "2014-02-10 19:22:15.856289",
          "last_active": "0.000000",
          "last_clean": "0.000000",
          "last_became_active": "0.000000",
          "last_unstale": "2014-02-10 19:35:37.361600",
          "mapping_epoch": 36274,
          "log_start": "34732'41286",
          "ondisk_log_start": "34732'41286",
          "created": 1,
          "last_epoch_clean": 34760,
          "parent": "0.0",
          "parent_split_bits": 0,
          "last_scrub": "34757'44284",
          "last_scrub_stamp": "2014-02-08 11:33:51.835956",
          "last_deep_scrub": "34757'44284",
          "last_deep_scrub_stamp": "2014-02-08 11:33:45.299503",
          "last_clean_scrub_stamp": "2014-02-08 11:33:51.835956",
          "log_size": 3000,
          "ondisk_log_size": 3000,
          "stats_invalid": "0",
          "stat_sum": { "num_bytes": 13767208960,
              "num_objects": 3306,
              "num_object_clones": 0,
              "num_object_copies": 6612,
              "num_objects_missing_on_primary": 0,
              "num_objects_degraded": 0,
              "num_objects_unfound": 0,
              "num_objects_dirty": 3300,
              "num_whiteouts": 0,
              "num_read": 0,
              "num_read_kb": 0,
              "num_write": 0,
              "num_write_kb": 0,
              "num_scrub_errors": 0,
              "num_shallow_scrub_errors": 0,
              "num_deep_scrub_errors": 0,
              "num_objects_recovered": 0,
              "num_bytes_recovered": 0,
              "num_keys_recovered": 0},
          "stat_cat_sum": {},
          "up": [
                6,
                5],
          "acting": [
                6,
                5]},
      "empty": 0,
      "dne": 0,
      "incomplete": 1,
      "last_epoch_started": 36252,
      "hit_set_history": { "current_last_update": "0'0",
          "current_last_stamp": "0.000000",
          "current_info": { "begin": "0.000000",
              "end": "0.000000",
              "version": "0'0"},
          "history": []}},
  "peer_info": [
        { "peer": 5,
          "pgid": "2.28b",
          "last_update": "34757'44284",
          "last_complete": "34757'44284",
          "log_tail": "34732'41284",
          "last_user_version": 0,
          "last_backfill": "84ed7a8b\/rbd_data.a623c2ae8944a.0000000000052a3a\/head\/\/2",
          "purged_snaps": "[]",
          "history": { "epoch_created": 1,
              "last_epoch_started": 36252,
              "last_epoch_clean": 34760,
              "last_epoch_split": 0,
              "same_up_since": 35405,
              "same_interval_since": 36276,
              "same_primary_since": 36274,
              "last_scrub": "34757'44284",
              "last_scrub_stamp": "2014-02-08 11:33:51.835956",
              "last_deep_scrub": "34757'44284",
              "last_deep_scrub_stamp": "2014-02-08 11:33:45.299503",
              "last_clean_scrub_stamp": "2014-02-08 11:33:51.835956"},
          "stats": { "version": "34757'44284",
              "reported_seq": "247",
              "reported_epoch": "35404",
              "state": "down+peering",
              "last_fresh": "2014-02-09 21:05:56.090968",
              "last_change": "2014-02-09 21:05:33.224591",
              "last_active": "0.000000",
              "last_clean": "0.000000",
              "last_became_active": "0.000000",
              "last_unstale": "2014-02-09 21:05:56.090968",
              "mapping_epoch": 36274,
              "log_start": "34732'41284",
              "ondisk_log_start": "34732'41284",
              "created": 1,
              "last_epoch_clean": 34760,
              "parent": "0.0",
              "parent_split_bits": 0,
              "last_scrub": "34757'44284",
              "last_scrub_stamp": "2014-02-08 11:33:51.835956",
              "last_deep_scrub": "34757'44284",
              "last_deep_scrub_stamp": "2014-02-08 11:33:45.299503",
              "last_clean_scrub_stamp": "2014-02-08 11:33:51.835956",
              "log_size": 3000,
              "ondisk_log_size": 3000,
              "stats_invalid": "0",
              "stat_sum": { "num_bytes": 13771403264,
                  "num_objects": 3307,
                  "num_object_clones": 0,
                  "num_object_copies": 6614,
                  "num_objects_missing_on_primary": 0,
                  "num_objects_degraded": 0,
                  "num_objects_unfound": 0,
                  "num_objects_dirty": 0,
                  "num_whiteouts": 0,
                  "num_read": 0,
                  "num_read_kb": 0,
                  "num_write": 0,
                  "num_write_kb": 0,
                  "num_scrub_errors": 0,
                  "num_shallow_scrub_errors": 0,
                  "num_deep_scrub_errors": 0,
                  "num_objects_recovered": 0,
                  "num_bytes_recovered": 0,
                  "num_keys_recovered": 0},
              "stat_cat_sum": {},
              "up": [
                    6,
                    5],
              "acting": [
                    6,
                    5]},
          "empty": 0,
          "dne": 0,
          "incomplete": 1,
          "last_epoch_started": 35110,
          "hit_set_history": { "current_last_update": "0'0",
              "current_last_stamp": "0.000000",
              "current_info": { "begin": "0.000000",
                  "end": "0.000000",
                  "version": "0'0"},
              "history": []}},
        { "peer": 8,
          "pgid": "2.28b",
          "last_update": "35256'44286",
          "last_complete": "35256'44286",
          "log_tail": "34732'41284",
          "last_user_version": 44286,
          "last_backfill": "a8dd7a8b\/benchmark_data_seven_910_object168\/head\/\/2",
          "purged_snaps": "[]",
          "history": { "epoch_created": 1,
              "last_epoch_started": 35225,
              "last_epoch_clean": 34760,
              "last_epoch_split": 0,
              "same_up_since": 35405,
              "same_interval_since": 36276,
              "same_primary_since": 36274,
              "last_scrub": "34757'44284",
              "last_scrub_stamp": "2014-02-08 11:33:51.835956",
              "last_deep_scrub": "34757'44284",
              "last_deep_scrub_stamp": "2014-02-08 11:33:45.299503",
              "last_clean_scrub_stamp": "2014-02-08 11:33:51.835956"},
          "stats": { "version": "35256'44286",
              "reported_seq": "109",
              "reported_epoch": "35310",
              "state": "peering",
              "last_fresh": "2014-02-09 19:52:07.683337",
              "last_change": "2014-02-09 19:52:07.683337",
              "last_active": "0.000000",
              "last_clean": "0.000000",
              "last_became_active": "0.000000",
              "last_unstale": "2014-02-09 19:52:07.683337",
              "mapping_epoch": 36274,
              "log_start": "34732'41284",
              "ondisk_log_start": "34732'41284",
              "created": 1,
              "last_epoch_clean": 34760,
              "parent": "0.0",
              "parent_split_bits": 0,
              "last_scrub": "34757'44284",
              "last_scrub_stamp": "2014-02-08 11:33:51.835956",
              "last_deep_scrub": "34757'44284",
              "last_deep_scrub_stamp": "2014-02-08 11:33:45.299503",
              "last_clean_scrub_stamp": "2014-02-08 11:33:51.835956",
              "log_size": 3002,
              "ondisk_log_size": 3002,
              "stats_invalid": "0",
              "stat_sum": { "num_bytes": 13763014656,
                  "num_objects": 3305,
                  "num_object_clones": 0,
                  "num_object_copies": 0,
                  "num_objects_missing_on_primary": 0,
                  "num_objects_degraded": 0,
                  "num_objects_unfound": 0,
                  "num_objects_dirty": 0,
                  "num_whiteouts": 0,
                  "num_read": 0,
                  "num_read_kb": 0,
                  "num_write": 0,
                  "num_write_kb": 0,
                  "num_scrub_errors": 0,
                  "num_shallow_scrub_errors": 0,
                  "num_deep_scrub_errors": 0,
                  "num_objects_recovered": 0,
                  "num_bytes_recovered": 0,
                  "num_keys_recovered": 0},
              "stat_cat_sum": {},
              "up": [
                    6,
                    5],
              "acting": [
                    6,
                    5]},
          "empty": 0,
          "dne": 0,
          "incomplete": 1,
          "last_epoch_started": 35225,
          "hit_set_history": { "current_last_update": "0'0",
              "current_last_stamp": "0.000000",
              "current_info": { "begin": "0.000000",
                  "end": "0.000000",
                  "version": "0'0"},
              "history": []}}],
  "recovery_state": [
        { "name": "Started\/Primary\/Peering",
          "enter_time": "2014-02-10 19:22:15.855010",
          "past_intervals": [
                { "first": 34758,
                  "last": 34796,
                  "maybe_went_rw": 1,
                  "up": [
                        21],
                  "acting": [
                        21]},
                { "first": 34797,
                  "last": 34899,
                  "maybe_went_rw": 1,
                  "up": [
                        21,
                        5],
                  "acting": [
                        21,
                        5]},
                { "first": 34900,
                  "last": 34946,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 34947,
                  "last": 34952,
                  "maybe_went_rw": 1,
                  "up": [
                        21,
                        5],
                  "acting": [
                        21,
                        5]},
                { "first": 34953,
                  "last": 34957,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 34958,
                  "last": 34959,
                  "maybe_went_rw": 1,
                  "up": [
                        21,
                        5],
                  "acting": [
                        21,
                        5]},
                { "first": 34960,
                  "last": 35053,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35054,
                  "last": 35055,
                  "maybe_went_rw": 1,
                  "up": [
                        21,
                        5],
                  "acting": [
                        21,
                        5]},
                { "first": 35056,
                  "last": 35062,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35063,
                  "last": 35065,
                  "maybe_went_rw": 1,
                  "up": [
                        21,
                        5],
                  "acting": [
                        21,
                        5]},
                { "first": 35066,
                  "last": 35068,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35069,
                  "last": 35071,
                  "maybe_went_rw": 1,
                  "up": [
                        21,
                        5],
                  "acting": [
                        21,
                        5]},
                { "first": 35072,
                  "last": 35108,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35109,
                  "last": 35112,
                  "maybe_went_rw": 1,
                  "up": [
                        21,
                        5],
                  "acting": [
                        21,
                        5]},
                { "first": 35113,
                  "last": 35120,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35121,
                  "last": 35160,
                  "maybe_went_rw": 1,
                  "up": [
                        8,
                        5],
                  "acting": [
                        8,
                        5]},
                { "first": 35161,
                  "last": 35174,
                  "maybe_went_rw": 1,
                  "up": [
                        8],
                  "acting": [
                        8]},
                { "first": 35175,
                  "last": 35181,
                  "maybe_went_rw": 1,
                  "up": [
                        8,
                        5],
                  "acting": [
                        8,
                        5]},
                { "first": 35182,
                  "last": 35194,
                  "maybe_went_rw": 0,
                  "up": [
                        8],
                  "acting": [
                        8]},
                { "first": 35195,
                  "last": 35214,
                  "maybe_went_rw": 0,
                  "up": [
                        8,
                        5],
                  "acting": [
                        8,
                        5]},
                { "first": 35215,
                  "last": 35222,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35223,
                  "last": 35223,
                  "maybe_went_rw": 0,
                  "up": [
                        8,
                        5],
                  "acting": [
                        8,
                        5]},
                { "first": 35224,
                  "last": 35264,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        21,
                        8]},
                { "first": 35265,
                  "last": 35265,
                  "maybe_went_rw": 0,
                  "up": [
                        6,
                        5],
                  "acting": [
                        8]},
                { "first": 35266,
                  "last": 35287,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6,
                        5]},
                { "first": 35288,
                  "last": 35299,
                  "maybe_went_rw": 1,
                  "up": [
                        6],
                  "acting": [
                        6]},
                { "first": 35300,
                  "last": 35303,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6,
                        5]},
                { "first": 35304,
                  "last": 35305,
                  "maybe_went_rw": 1,
                  "up": [
                        6],
                  "acting": [
                        6]},
                { "first": 35306,
                  "last": 35376,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6,
                        5]},
                { "first": 35377,
                  "last": 35386,
                  "maybe_went_rw": 0,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35387,
                  "last": 35396,
                  "maybe_went_rw": 0,
                  "up": [],
                  "acting": []},
                { "first": 35397,
                  "last": 35404,
                  "maybe_went_rw": 1,
                  "up": [
                        5],
                  "acting": [
                        5]},
                { "first": 35405,
                  "last": 35407,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6,
                        5]},
                { "first": 35408,
                  "last": 35616,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        21,
                        6]},
                { "first": 35617,
                  "last": 35618,
                  "maybe_went_rw": 0,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6]},
                { "first": 35619,
                  "last": 36246,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6,
                        5]},
                { "first": 36247,
                  "last": 36248,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        21,
                        6]},
                { "first": 36249,
                  "last": 36249,
                  "maybe_went_rw": 0,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6]},
                { "first": 36250,
                  "last": 36250,
                  "maybe_went_rw": 0,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6,
                        5]},
                { "first": 36251,
                  "last": 36273,
                  "maybe_went_rw": 1,
                  "up": [
                        6,
                        5],
                  "acting": [
                        21,
                        6]},
                { "first": 36274,
                  "last": 36275,
                  "maybe_went_rw": 0,
                  "up": [
                        6,
                        5],
                  "acting": [
                        6]}],
          "probing_osds": [
                5,
                6],
          "down_osds_we_would_probe": [
                21],
          "peering_blocked_by": []},
        { "name": "Started",
          "enter_time": "2014-02-10 19:22:15.854966"}]}

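In case it helps anyone reading the past_intervals list above, here is a rough Python sketch that lists the intervals where a given OSD was in the acting set and maybe_went_rw was set. It only reads the fields shown in the dump; the function name intervals_needing_osd and the trimmed sample are made up for illustration, and it assumes maybe_went_rw marks an interval that may have accepted writes.

```python
import json


def intervals_needing_osd(pg_query, osd_id):
    """Return (first, last) epochs of past intervals that may have gone
    read-write while osd_id was in the acting set.

    Hypothetical helper: it only inspects the "past_intervals" entries
    of a pg query dump like the one above.
    """
    hits = []
    for state in pg_query.get("recovery_state", []):
        for interval in state.get("past_intervals", []):
            if interval.get("maybe_went_rw") and osd_id in interval.get("acting", []):
                hits.append((interval["first"], interval["last"]))
    return hits


# Trimmed excerpt of the dump above: the last few intervals only,
# reduced to the fields the helper reads.
sample = json.loads("""
{"recovery_state": [
  {"name": "Started/Primary/Peering",
   "past_intervals": [
     {"first": 36247, "last": 36248, "maybe_went_rw": 1, "up": [6, 5], "acting": [21, 6]},
     {"first": 36249, "last": 36249, "maybe_went_rw": 0, "up": [6, 5], "acting": [6]},
     {"first": 36251, "last": 36273, "maybe_went_rw": 1, "up": [6, 5], "acting": [21, 6]}],
   "down_osds_we_would_probe": [21]},
  {"name": "Started"}]}
""")

# Report, for each down OSD the primary wants to probe, which
# intervals it took part in that may have accepted writes.
for osd in sample["recovery_state"][0]["down_osds_we_would_probe"]:
    print(osd, intervals_needing_osd(sample, osd))
```

Against the full output, the same function could be run on the parsed JSON from the query instead of the trimmed sample.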

--
Aaron Ten Clay
http://www.aarontc.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
