If you have a backfillfull condition, no PGs will be able to migrate. It is better to just add hard drives, because at least one of your OSDs is too full.

I know you can set the backfillfull ratios with commands like these:

    ceph tell osd.* injectargs '--mon_osd_full_ratio=0.970000'
    ceph tell osd.* injectargs '--mon_osd_backfillfull_ratio=0.950000'

    ceph tell osd.* injectargs '--mon_osd_full_ratio=0.950000'
    ceph tell osd.* injectargs '--mon_osd_backfillfull_ratio=0.900000'

Or maybe decrease the weight of the full OSD. Check the OSDs with 'ceph osd status' and make sure your nodes have an even distribution of the storage.

-----Original Message-----
From: Rik [mailto:rik@xxxxxxxxxx]
Sent: Sunday 20 January 2019 8:47
To: ceph-users@xxxxxxxxxxxxxx
Subject: Salvage CEPHFS after lost PG

Hi all,

I'm looking for some suggestions on how to do something inappropriate.

In a nutshell, I've lost the WAL/DB for three bluestore OSDs on a small cluster and, as a result of those three OSDs going offline, I've lost a placement group (7.a7). How I achieved this feat is an embarrassing mistake, which I don't think has bearing on my question.

The OSDs were created a few months ago with ceph-deploy:

    /usr/local/bin/ceph-deploy --overwrite-conf osd create --bluestore --data /dev/vdc1 --block-db /dev/vdf1 ceph-a

With the 3 OSDs out, I'm sitting at OSD_BACKFILLFULL.

First, the PG 7.a7 belongs to the data pool rather than the metadata pool, and if I run "cephfs-data-scan pg_files / 7.a7" then I get a list of 4149 files/objects, but then it hangs. I don't understand why this would hang if it's only the data pool which is impacted (since pg_files only operates on the metadata pool?).

The ceph-log shows:

    cluster [WRN] slow request 30.894832 seconds old, received at 2019-01-20 18:00:12.555398:
    client_request(client.25017730:21 8006 lookup #0x10001c8ce15/000001 2019-01-20 18:00:12.550421 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting

Is the hang perhaps related to the OSD_BACKFILLFULL?
If so, I could add some completely new OSDs to fix that problem. I have held off doing that for now, as it will trigger a whole lot of data movement which might be unnecessary. Or is the hang indeed related to the missing PG?

Second, if I try to copy files out of the CEPHFS filesystem, I get a few hundred files and then it too hangs. None of the files I'm attempting to copy are listed in the pg_files output (although since pg_files hangs, perhaps it hadn't got to those files yet). Again, should I not be able to access files which are not associated with a missing data pool PG?

Lastly, I want to know if there is some way to recreate the WAL/DB while leaving the OSD data intact and/or fool one of the OSDs into thinking everything is OK, allowing it to serve up the data it has in the missing PG.

From reading the mailing list and documentation, I know that this is not a "safe" operation:

    http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021713.html
    http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-January/024268.html

However, my current status indicates an unusable CEPHFS and limited access to the data. I'd like to get as much data off it as possible and then I expect to have to recreate it. With a combination of the backups I have and what I can salvage from the cluster, I should hopefully have most of what I need.

I know what I *should* have done, but now that I'm at this point, I know I'm asking for something which would never be required on a properly-run cluster.

If it really is not possible to get the (possibly corrupt) PG back again, can I get the cluster back so the remainder of the files are accessible?

Currently running mimic 13.2.4 on all nodes.

Status:

    $ ceph health detail - https://gist.github.com/kawaja/f59d231179b3186748eca19aae26bcd4
    $ ceph fs get main - https://gist.github.com/kawaja/a7ab0b285d53dee6a950a4310be4fa5a

Any advice on where I could go from here would be greatly appreciated.

thanks,
rik.
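For reference, the ratio and reweight suggestions from the top of this thread can be sketched as below. As far as I know, on Luminous and later (which includes the mimic 13.2.4 mentioned above) the full and backfillfull thresholds are stored in the OSD map and are normally changed with the "ceph osd set-*-ratio" commands, so that is what this sketch uses instead of injectargs. The ratio values (0.92, 0.97), the OSD id (3) and the weight (0.85) are illustrative assumptions, not values from this cluster, and the run/DRY_RUN wrapper is a hypothetical helper added only so the commands can be previewed without touching a cluster.

```shell
#!/bin/sh
# Sketch only. All numeric values passed to these functions are assumptions.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Temporarily raise the thresholds so backfill can proceed.
# $1 = backfillfull ratio, $2 = full ratio (keep backfillfull below full).
raise_ratios() {
    run ceph osd set-backfillfull-ratio "$1"
    run ceph osd set-full-ratio "$2"
}

# Show per-OSD utilisation and the ratios currently stored in the OSD map.
inspect_cluster() {
    run ceph osd df tree
    run ceph osd dump
}

# Lower the override weight of an over-full OSD.
# $1 = OSD id, $2 = weight between 0.0 and 1.0.
lower_weight() {
    run ceph osd reweight "$1" "$2"
}

# Example (prints the commands while DRY_RUN=1):
#   inspect_cluster
#   raise_ratios 0.92 0.97
#   lower_weight 3 0.85
```

Raising the ratios only buys headroom for recovery; they should be set back to their previous values once the cluster has rebalanced or new OSDs have been added.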