On Fri, Jul 21, 2017 at 10:23 PM Daniel K <sathackr@xxxxxxxxx> wrote:
Luminous 12.1.0 (RC)

I replaced two OSD drives (old ones were still good, just too small), using:

    ceph osd out osd.12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    systemctl stop ceph-osd@osd.12
    ceph osd rm osd.12

I later found that I also should have unmounted it from /var/lib/ceph/osd-12.

(remove old disk, insert new disk)

I added the new disk/osd with:

    ceph-deploy osd prepare stor-vm3:sdg --bluestore

This automatically activated the osd (not sure why, I thought it needed a ceph-deploy osd activate as well).

Then, working on an unrelated issue, I upgraded one (out of 4 total) nodes to 12.1.1 using apt and rebooted. The mon daemon would not form a quorum with the others on 12.1.0, so, instead of troubleshooting that, I just went ahead and upgraded the other 3 nodes and rebooted.

Lots of recovery IO went on afterwards, but now things have stopped at:

    pools:   10 pools, 6804 pgs
    objects: 1784k objects, 7132 GB
    usage:   11915 GB used, 19754 GB / 31669 GB avail
    pgs:     0.353% pgs not active
             70894/2988573 objects degraded (2.372%)
             422090/2988573 objects misplaced (14.123%)
             6626 active+clean
             129  active+remapped+backfill_wait
             23   incomplete
             14   active+undersized+degraded+remapped+backfill_wait
             4    active+undersized+degraded+remapped+backfilling
             4    active+remapped+backfilling
             2    active+clean+scrubbing+deep
             1    peering
             1    active+recovery_wait+degraded+remapped

When I run ceph pg query on the incompletes, they all list at least one of the two removed OSDs (12, 17) in "down_osds_we_would_probe".

Most pools are size:2 min_size:1 (trusting bluestore to tell me which one is valid). One pool is size:1 min_size:1 and I'm okay with losing it, except I had it mounted in a directory on cephfs; I rm'd the directory, but I can't delete the pool because it's "in use by CephFS".

I still have the old drives, can I stick them into another host and re-add them somehow?
Yes, that'll probably be your easiest solution. You may have some trouble because you already deleted them, but I'm not sure.
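If the data directories on those disks are still intact, re-registering one of them would look roughly like the sketch below (untested, shown for osd.12; the device, CRUSH weight, and host are placeholders you'd have to fill in, and it assumes the fsid and keyring files are still present in the OSD's data directory):

    # mount the old OSD's data partition where the daemon expects to find it
    mount /dev/sdX1 /var/lib/ceph/osd/ceph-12

    # recreate the OSD id using the uuid recorded in the data directory
    ceph osd create $(cat /var/lib/ceph/osd/ceph-12/fsid) 12

    # restore its cephx key and put it back into the CRUSH map, then start it
    ceph auth add osd.12 osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-12/keyring
    ceph osd crush add osd.12 1.0 host=<new-host>
    ceph osd in osd.12
    systemctl start ceph-osd@12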
Alternatively, you ought to be able to remove the pool from CephFS using some of the monitor commands and then delete it.
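From memory that's roughly the following (the filesystem and pool names are placeholders, and I haven't double-checked this against Luminous, so verify it against the docs first):

    # detach the pool from the filesystem once nothing in it references the pool
    # (this won't work if it's the filesystem's default data pool)
    ceph fs rm_data_pool <fs_name> <pool_name>

    # then delete it; Luminous also wants mon_allow_pool_delete=true on the mons
    ceph osd pool delete <pool_name> <pool_name> --yes-i-really-really-mean-it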
This data isn't super important, but I'd like to learn a bit on how to recover when bad things happen, as we are planning a production deployment in a couple of weeks.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com