On Thu, 14 Jun 2018, Wyllys Ingersoll wrote:
> Yes, we lost several disks recently and they were all removed probably
> faster than they should have been (i.e. we didn't wait for them to
> rebalance individually before removing more).
>
> Is there any way to map an object or pg to a cephfs file so at least
> we will know which files are going to be corrupted if we mark them
> complete?

No... the cluster doesn't know what objects were in the PG if the PG is
incomplete.  It doesn't keep a parallel record of what would have been
stored.

I'd try to dig up the removed disks...

sage

>
> On Thu, Jun 14, 2018 at 12:13 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > On Thu, 14 Jun 2018, Wyllys Ingersoll wrote:
> >> I cut out a HUGE list of "purged_snaps" to keep this a little shorter...
> >>
> >> $ cat 1.10e.txt
> >> {
> >>     "state": "incomplete",
> >>     "snap_trimq": "[]",
> >>     "snap_trimq_len": 0,
> >>     "epoch": 465904,
> >>     "up": [
> >>         52,
> >>         23,
> >>         20
> >>     ],
> >>     "acting": [
> >>         52,
> >>         23,
> >>         20
> >>     ],
> >>     "info": {
> >>         "pgid": "1.10e",
> >>         "last_update": "438490'293946",
> >>         "last_complete": "438490'293946",
> >>         "log_tail": "427182'292446",
> >>         "last_user_version": 0,
> >>         "last_backfill": "MIN",
> > ...
> >>     "peer_info": [
> >>         {
> >>             "peer": "5",
> >>             "pgid": "1.10e",
> >>             "last_update": "438490'293946",
> >>             "last_complete": "438490'293946",
> >>             "log_tail": "427182'292446",
> >>             "last_user_version": 0,
> >>             "last_backfill": "MIN",
> > ...
> >>         },
> >>         {
> >>             "peer": "10",
> >>             "pgid": "1.10e",
> >>             "last_update": "438490'293946",
> >>             "last_complete": "438490'293946",
> >>             "log_tail": "427182'292446",
> >>             "last_user_version": 0,
> >>             "last_backfill": "MIN",
> > ...
> >>         }
> >>     ],
> >
> > It looks like all of the copies of this PG are in fact incomplete
> > (partially backfilled, not the complete set of objects).  You must have
> > lost a disk somewhere?  Is there another copy?
> >
> > If not, then as a last resort you can go look at each one, see which copy
> > of the PG has the most objects, and mark it complete, and everything else
> > will backfill from there.  That almost certainly means admitting
> > defeat and losing some data, though.
> >
> > sage
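
For anyone following the last-resort route described in the quoted message, the comparison step can be roughed out from the saved query output. The sketch below is only an illustration, not part of the original thread: it reads a dict shaped like the 1.10e.txt output above and lists each copy with its last_backfill value. The function names are hypothetical. It assumes that a fully backfilled copy reports last_backfill as "MAX", and that a per-copy object count, when present, sits under stats.stat_sum.num_objects; both should be checked against your own output. Those counts are only what the cluster last recorded, so inspecting the PG on each stopped OSD remains the more reliable check.

```python
# Sketch only: summarize each known copy of a PG from saved "pg query" JSON.
# SAMPLE holds only values shown in the thread; the "..." elisions are omitted.
SAMPLE = {
    "state": "incomplete",
    "info": {"pgid": "1.10e", "last_update": "438490'293946",
             "last_backfill": "MIN"},
    "peer_info": [
        {"peer": "5", "pgid": "1.10e", "last_update": "438490'293946",
         "last_backfill": "MIN"},
        {"peer": "10", "pgid": "1.10e", "last_update": "438490'293946",
         "last_backfill": "MIN"},
    ],
}


def summarize_pg_copies(pg_query):
    """Return one row per copy: the primary's own info first, then each peer."""
    copies = [("primary", pg_query.get("info", {}))]
    for peer in pg_query.get("peer_info", []):
        copies.append((str(peer.get("peer", "?")), peer))

    rows = []
    for name, info in copies:
        # Assumption: object counts, if present, live under stats.stat_sum.
        stat_sum = info.get("stats", {}).get("stat_sum", {})
        rows.append({
            "copy": name,
            "last_update": info.get("last_update"),
            "last_backfill": info.get("last_backfill"),
            # Assumption: "MAX" marks a copy that finished backfill.
            "fully_backfilled": info.get("last_backfill") == "MAX",
            "num_objects": stat_sum.get("num_objects"),
        })
    return rows


def best_candidate(rows):
    """Return the row with the highest recorded object count, or None."""
    counted = [r for r in rows if r["num_objects"] is not None]
    if not counted:
        return None
    return max(counted, key=lambda r: r["num_objects"])


if __name__ == "__main__":
    for row in summarize_pg_copies(SAMPLE):
        print(row)
```

To run it against a real file, load it with json.load() and pass the resulting dict to summarize_pg_copies(). If no copy carries an object count, best_candidate() returns None rather than guessing.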