We're looking for some assistance recovering data from a failed Ceph cluster, or some help determining whether it is even possible to recover any data.

Background:

- We were using Ceph with Proxmox following the instructions Proxmox provides (https://pve.proxmox.com/wiki/Ceph_Server), which seem fairly close to the Ceph recommendations except that the storage is on the same physical systems the virtual machines run on.
- Some of our Proxmox nodes use ZFS, and there is a rare bug where ZFS + Proxmox clustering can result in Proxmox hanging indefinitely.
- We were using HA on our Proxmox nodes, which means that when they hang they are automatically rebooted (hard).
- Hard reboots are bad for file systems.
- Hard reboots mean that Ceph tries to recover, which means more systems hitting the bug, followed by more restarts and general mayhem.

We first ran into issues overnight, and at some point during the process the file system on one of the OSDs was corrupted. We managed to stabilize the systems; however, we've not been able to recover the critical data from the pool (about 5-10% of it).

Current cluster health:

    cluster 537a3e12-95d8-48c3-9e82-91abbfdf62e0
     health HEALTH_WARN
            5 pgs degraded
            8 pgs down
            48 pgs incomplete
            3 pgs recovering
            1 pgs recovery_wait
            76 pgs stale
            5 pgs stuck degraded
            48 pgs stuck inactive
            76 pgs stuck stale
            53 pgs stuck unclean
            5 pgs stuck undersized
            5 pgs undersized
            74 requests are blocked > 32 sec
            recovery 14656/6951979 objects degraded (0.211%)
            recovery 20585/6951979 objects misplaced (0.296%)
            recovery 5/3348270 unfound (0.000%)
     monmap e7: 7 mons at {0=10.11.0.126:6789/0,1=10.11.0.125:6789/0,2=10.11.0.124:6789/0,3=10.11.0.123:6789/0,4=10.11.0.122:6789/0,5=10.11.0.119:6789/0,6=10.11.0.121:6789/0}
            election epoch 482, quorum 0,1,2,3,4,5,6 5,6,4,3,2,1,0
     osdmap e15746: 16 osds: 16 up, 16 in; 5 remapped pgs
      pgmap v10200890: 3072 pgs, 3 pools, 12914 GB data, 3269 kobjects
            26923 GB used, 23327 GB / 50250 GB avail
            14656/6951979 objects degraded (0.211%)
            20585/6951979 objects misplaced (0.296%)
            5/3348270 unfound (0.000%)
                2943 active+clean
                  76 stale+active+clean
                  40 incomplete
                   8 down+incomplete
                   3 active+recovering+undersized+degraded+remapped
                   1 active+recovery_wait+undersized+degraded+remapped
                   1 active+undersized+degraded+remapped

There are two RBDs we are looking to recover (out of about 130), totalling about 200 GB of data. Those RBDs do not appear to be using any of the PGs which are incomplete or down, but they do seem to use ones which are stale+active+clean, so any read from the mapped RBD blocks indefinitely.

We were looking at http://ceph.com/community/incomplete-pgs-oh-my/ as a means of recovering the incomplete PGs, since the complete copies do seem to be on the corrupted OSD and most or all of them were exportable without issue; however, I'm not sure if this is the correct way to go or if I should be looking at something else. Rough versions of the commands we've been working with are below.
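To work out which PGs each image touches, we did something along these lines; the pool name, image name and object prefix below are placeholders, not our real ones:

    # Find the object name prefix for the image (names here are examples)
    rbd info rbd/vm-disk-to-recover
    #   ... block_name_prefix: rbd_data.1a2b3c4d5e6f ...

    # Map each of the image's objects to its PG and acting OSDs.
    # Listing a pool this size is slow, and may itself block on stuck PGs.
    rados -p rbd ls | grep '^rbd_data.1a2b3c4d5e6f' | while read obj; do
        ceph osd map rbd "$obj"
    done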
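To see which PGs are actually stuck and what they are waiting for, commands along these lines (the PG ID is just an example):

    # List the PGs that are stuck stale / inactive
    ceph pg dump_stuck stale
    ceph pg dump_stuck inactive

    # Ask a specific PG for its state, peering history and the OSDs it wants
    ceph pg 2.1ab query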
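The export/import procedure from that blog post, as we understood it, looks roughly like the following. The OSD numbers, PG ID and paths are examples only, and each OSD has to be stopped while ceph-objectstore-tool runs against its store. This is the part I'd most like a sanity check on before we touch the live cluster again:

    # On the OSD that still holds a complete copy of the PG (OSD stopped)
    ceph-objectstore-tool --op export --pgid 2.1ab \
        --data-path /var/lib/ceph/osd/ceph-3 \
        --journal-path /var/lib/ceph/osd/ceph-3/journal \
        --file /root/pg-exports/2.1ab.export

    # On the OSD that should own the PG but reports it incomplete (OSD stopped):
    # remove the broken copy, then import the exported one
    ceph-objectstore-tool --op remove --pgid 2.1ab \
        --data-path /var/lib/ceph/osd/ceph-5 \
        --journal-path /var/lib/ceph/osd/ceph-5/journal
    ceph-objectstore-tool --op import \
        --data-path /var/lib/ceph/osd/ceph-5 \
        --journal-path /var/lib/ceph/osd/ceph-5/journal \
        --file /root/pg-exports/2.1ab.export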