Hello people,

After a series of events and some operational mistakes, 1 PG in our cluster is
stuck in the active+recovering+degraded+remapped state, reporting 1 unfound
object.

We're running Hammer (v0.94.9) on top of Debian Jessie, on 27 nodes and 162
osds, with the default crushmap and the nodeep-scrub flag set. Unfortunately,
all pools on our cluster are set up with replica size = 2 and min_size = 1.

My main problem is that ceph pg <pg> list_missing does not report which objects
are considered unfound, making it quite difficult to understand what is
happening and how to recover without doing any more damage. Specifically, the
output of the command is this:
# ceph pg 5.658 list_missing
{
    "offset": {
        "oid": "",
        "key": "",
        "snapid": 0,
        "hash": 0,
        "max": 0,
        "pool": -1,
        "namespace": ""
    },
    "num_missing": 0,
    "num_unfound": 1,
    "objects": [],
    "more": 0
}

I took a look at Ceph's official docs and at older threads on this list, but in
every case I found, ceph was reporting the objects that it could not find.
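
In other words, we expected the "objects" array to actually list the unfound
object(s), along the lines of the example in the docs. The object name and
versions below are just placeholders (not real output from our cluster), shown
only to illustrate what we were looking for:

    "objects": [
        { "oid": { "oid": "placeholder_object_name", "key": "", "snapid": 0,
                   "hash": 0, "max": 0, "pool": 5, "namespace": "" },
          "need": "379262'1234",
          "have": "0'0"
        }
    ],

Instead we get an empty array, even though num_unfound is 1.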

Our cluster got into that state after a series of events and mistakes. I will
provide some timestamps too.
* osds of one node were down+out because of a recent failure (6 osds)
* We decided to start one osd (osd.120) to see how it would behave
* At 14:56:06 we start osd.120
* After starting osd.120, we noticed that recovery starts. As I understand now,
  we did not want the osd to join the cluster, so we decided to take it down
  again. It seems to me now that this was a panic move, but anyway, it happened.
* At 14:57:23 we shut down osd.120.
* Some pgs that were mapped on osd.120 are reported to be down, and stuck
  requests targeting those osds are popping up. Of course, that meant that we
  needed to start the osd again.
* At 15:02:59 we start osd.120. PGs are getting up and start peering.
* At 15:03:24, osd.33 (living on a different node) crashes with the following
  assertion:

  0> 2017-09-08 15:03:24.041412 7ff679fa4700 -1 osd/ReplicatedPG.cc: In function 'virtual void ReplicatedPG::on_local_recover(const hobject_t&, const object_stat_sum_t&, const ObjectRecoveryInfo&, ObjectContextRef, ObjectStore::Transaction*)' thread 7ff679fa4700 time 2017-09-08 15:03:24.002997
  osd/ReplicatedPG.cc: 211: FAILED assert(is_primary())
* At 15:03:29 the cluster reports that 1 object is unfound. We start
  investigating the issue.
* After some time, we notice that pgs mapped to osd.33 are degraded, so we
  decide to start osd.33 again. It seems to start normally without any issues.
* After some time, recovery almost finishes, with all pgs being in a healthy
  state except pg 5.658, which should contain the unfound object.
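
Since list_missing gives us nothing to work with, one idea we have (not tried
yet, so please correct us if this is a bad idea) is to stop one of the osds in
the acting set and list the PG's contents directly from disk with
ceph-objectstore-tool, then compare the listings of the two replicas to spot
the object that only one of them knows about. Roughly something like this
(paths are our filestore defaults, the osd id is just an example):

# with osd.155 stopped
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-155 \
    --journal-path /var/lib/ceph/osd/ceph-155/journal \
    --pgid 5.658 --op list > /tmp/pg5.658-osd155.txt

and the same for osd.120, followed by a diff of the two files.
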
Our cluster is now in the following state:
# ceph -s
    cluster 287f8859-9887-4bb3-ae27-531d2a1dbc95
     health HEALTH_WARN
            1 pgs degraded
            1 pgs recovering
            1 pgs stuck degraded
            1 pgs stuck unclean
            recovery 13/74653914 objects degraded (0.000%)
            recovery 300/74653914 objects misplaced (0.000%)
            recovery 1/37326882 unfound (0.000%)
            nodeep-scrub flag(s) set
     monmap e1: 3 mons at {rd0-00=some_ip:6789/0,rd0-01=some_ip2:6789/0,rd0-02=some_ip3:6789/0}
            election epoch 5462, quorum 0,1,2 rd0-00,rd0-01,rd0-02
     osdmap e379262: 162 osds: 157 up, 157 in; 1 remapped pgs
            flags nodeep-scrub
      pgmap v135824695: 18432 pgs, 5 pools, 98880 GB data, 36452 kobjects
            193 TB used, 89649 GB / 280 TB avail
            13/74653914 objects degraded (0.000%)
            300/74653914 objects misplaced (0.000%)
            1/37326882 unfound (0.000%)
               18430 active+clean
                   1 active+recovering+degraded+remapped
                   1 active+clean+scrubbing
  client io 9776 kB/s rd, 10937 kB/s wr, 863 op/s
# ceph health detail
HEALTH_WARN 1 pgs degraded; 1 pgs recovering; 1 pgs stuck degraded; 1 pgs stuck unclean; recovery 13/74653918 objects degraded (0.000%); recovery 300/74653918 objects misplaced (0.000%); recovery 1/37326884 unfound (0.000%); nodeep-scrub flag(s) set
pg 5.658 is stuck unclean for 541763.344743, current state active+recovering+degraded+remapped, last acting [120,155]
pg 5.658 is stuck degraded for 201445.628108, current state active+recovering+degraded+remapped, last acting [120,155]
pg 5.658 is active+recovering+degraded+remapped, acting [120,155], 1 unfound
recovery 13/74653918 objects degraded (0.000%)
recovery 300/74653918 objects misplaced (0.000%)
recovery 1/37326884 unfound (0.000%)
nodeep-scrub flag(s) set
# ceph pg dump_stuck unclean
ok
pg_stat  state                                 up         up_primary  acting     acting_primary
5.658    active+recovering+degraded+remapped   [120,153]  120         [120,155]  120
# ceph pg 5.658 query
The output can be found here [1].

Also, we took a glance at the logs but did not notice anything strange except
the crashed osd and its error messages. Unfortunately, we have not investigated
the logs further yet, nor looked more closely into the crashed osd (osd.33).

Are there cases where a ceph cluster can report unfound objects without even
knowing which ones they are? Is that behavior expected, or did we hit a bug?
Has anyone encountered anything similar? If yes, how did you interpret the
output of the command, and how did you proceed in order to return the pg and
the cluster to a healthy state?
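
For completeness: the only documented way out we have found so far is marking
the unfound object as lost, i.e. something like

# ceph pg 5.658 mark_unfound_lost revert

but since we cannot even tell which object is affected, we are reluctant to
run it before we understand what is going on.
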
Best regards,
Nikos.
[1] https://pithos.okeanos.grnet.gr/public/fxrzW3tJYa8v7rPpcYxbF1