Thank you for all the help, Wido:
This might be the root of our problems. We didn't mark the parent OSD as "lost" before we removed it. Now ceph won't let us mark it as lost (and it is no longer in the OSD tree):

djakubiec@dev:~$ ceph osd lost 8 --yes-i-really-mean-it
osd.8 is not down or doesn't exist

djakubiec@dev:~$ ceph osd tree
ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 58.19960 root default
-2  7.27489     host node24
 1  7.27489         osd.1        up  1.00000          1.00000
-3  7.27489     host node25
 2  7.27489         osd.2        up  1.00000          1.00000
-4  7.27489     host node26
 3  7.27489         osd.3        up  1.00000          1.00000
-5  7.27489     host node27
 4  7.27489         osd.4        up  1.00000          1.00000
-6  7.27489     host node28
 5  7.27489         osd.5        up  1.00000          1.00000
-7  7.27489     host node29
 6  7.27489         osd.6        up  1.00000          1.00000
-8  7.27539     host node30
 9  7.27539         osd.9        up  1.00000          1.00000
-9  7.27489     host node31
 7  7.27489         osd.7        up  1.00000          1.00000

BUT, even though OSD 8 no longer exists, I still see lots of references to OSD 8 in various dumps and queries.

Interestingly, I do still see weird entries in the CRUSH map (should I do something about these?):

# devices
device 0 device0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 device8
device 9 osd.9

I then tried this on all 80 incomplete PGs:

ceph pg force_create_pg <pgid>

The 80 PGs moved to "creating" for a few minutes, but then all went back to "incomplete".

Is there some way to force individual PGs to be marked as "lost"?

Thanks!

-- Dan
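
P.S. In case it helps anyone reproduce what I mean by "weird entries": below is a minimal Python sketch that scans the devices section of a decompiled CRUSH map and reports the ids whose name is not "osd.<id>". The function name placeholder_devices and the regular expression are my own illustration, not part of any Ceph tool, and my reading that a "deviceN" name is just a placeholder for an id with no OSD behind it is an assumption. It only reads text; it does not touch the cluster.

```python
import re

# Matches lines such as "device 8 device8" or "device 1 osd.1".
# Hypothetical pattern written for this email, not taken from Ceph itself.
DEVICE_LINE = re.compile(r"^device\s+(\d+)\s+(\S+)")


def placeholder_devices(crush_text):
    """Return the device ids whose name is not 'osd.<id>'."""
    ids = []
    for line in crush_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match and match.group(2) != "osd." + match.group(1):
            ids.append(int(match.group(1)))
    return ids


# The devices section exactly as pasted above.
crush_devices = """\
# devices
device 0 device0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 device8
device 9 osd.9
"""

print(placeholder_devices(crush_devices))
```

On the section above this reports ids 0 and 8, which are the two ids with no matching OSD in the tree.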