Hi,

> On Thu, 25 May 2017, Łukasz Chrustek wrote:
>> Hi,
>>
>> > On Wed, 24 May 2017, Łukasz Chrustek wrote:
>> >> Hello,
>> >>
>> >> >> osd 6 - isn't startable
>> >>
>> >> > Disk completely 100% dead, or just broken enough that ceph-osd won't
>> >> > start? ceph-objectstore-tool can be used to extract a copy of the 2 pgs
>> >> > from this osd to recover any important writes on that osd.
>> >>
>> >> >> osd 10, 37, 72 are startable
>> >>
>> >> > With those started, I'd repeat the original sequence and get a fresh pg
>> >> > query to confirm that it still wants just osd.6.
>> >>
>> >> > use ceph-objectstore-tool to export the pg from osd.6, stop some other
>> >> > random osd (not one of these ones), import the pg into that osd, and
>> >> > start it again. once it is up, 'ceph osd lost 6'. the pg *should* peer
>> >> > at that point. repeat the same basic process with the other pg.
>> >>
>> >> Here is the output from ceph-objectstore-tool - it also didn't succeed:
>> >>
>> >> https://pastebin.com/7XGAHdKH
>>
>> > Hmm, btrfs:
>>
>> > 2017-05-24 23:28:58.547456 7f500948e940 -1
>> > filestore(/var/lib/ceph/osd/ceph-84) ERROR:
>> > /var/lib/ceph/osd/ceph-84/current/nosnap exists, not rolling back to avoid
>> > losing new data
>>
>> > You could try setting --osd-use-stale-snap as suggested.
>>
>> Yes... tried... and I simply get rided of 39GB data...

> What does "get rided" mean?

I mean the data is gone, according to this pastebin: https://pastebin.com/QPcpkjg4

ls -R /var/lib/ceph/osd/ceph-33/current/
/var/lib/ceph/osd/ceph-33/current/:
commit_op_seq  omap

/var/lib/ceph/osd/ceph-33/current/omap:
000003.log  CURRENT  LOCK  MANIFEST-000002

Earlier the data files were still there.

>> > Is it the same error with the other one?
>>
>> Yes: https://pastebin.com/7XGAHdKH
>>
>> > In particular, osd 37 38 48 67 all have incomplete copies of the PG (they
>> > are mid-backfill) and 68 has nothing. Some data is lost unless you can
>> > recover another OSD with that PG.
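For reference, the export/import/mark-lost sequence quoted above can be sketched as shell commands. This is only a sketch, not a tested runbook: the pg id (1.60), the OSD ids, and the filestore/journal paths are taken from this thread, and osd.84 as the import target is an illustrative assumption. Verify the flags against the ceph-objectstore-tool shipped with your Ceph release before running anything.

```sh
# Export the pg from the dead osd.6 (the osd daemon must be stopped).
ceph-objectstore-tool \
    --data-path /var/lib/ceph/osd/ceph-6 \
    --journal-path /var/lib/ceph/osd/ceph-6/journal \
    --pgid 1.60 --op export --file /tmp/pg-1.60.export

# Stop some other random osd that does not hold this pg (osd.84 here is
# just an example), import the pg there, and start it again.
systemctl stop ceph-osd@84
ceph-objectstore-tool \
    --data-path /var/lib/ceph/osd/ceph-84 \
    --journal-path /var/lib/ceph/osd/ceph-84/journal \
    --pgid 1.60 --op import --file /tmp/pg-1.60.export
systemctl start ceph-osd@84

# Once the importing osd is up, declare osd.6 lost; the pg should then peer.
ceph osd lost 6 --yes-i-really-mean-it
```

The same sequence would then be repeated for the second pg (1.165) with a different spare OSD as the import target.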
>> > The set of OSDs that might have data are: 6,10,33,72,84
>>
>> > If that bears no fruit, then you can force last_backfill to report
>> > complete on one of those OSDs and it'll think it has all the data even
>> > though some of it is likely gone. (We can pick one that is farther
>> > along... 38 48 and 67 seem to all match.)

Can you explain what you mean by 'force last_backfill to report
complete'? The current value for PG 1.60 is MAX, and for 1.165 it is
1/db616165/rbd_data.ed9979641a9d82.000000000001dcee/head

--
Regards,
Łukasz Chrustek

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
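[Editor's note on the last_backfill question above: ceph-objectstore-tool gained a mark-complete operation for exactly this incomplete-pg situation, which marks the pg's local copy as fully backfilled (last_backfill = MAX) so the OSD believes it has all the data. A hedged sketch, assuming osd.38 is the chosen OSD and that your Ceph version ships the op; any objects genuinely missing from that copy remain lost.]

```sh
# With the chosen osd (e.g. osd.38) stopped, mark pg 1.165 complete.
# WARNING: objects that this copy never received stay lost for good.
ceph-objectstore-tool \
    --data-path /var/lib/ceph/osd/ceph-38 \
    --journal-path /var/lib/ceph/osd/ceph-38/journal \
    --pgid 1.165 --op mark-complete
systemctl start ceph-osd@38
```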