Hi all,

I've got a placement group on a cluster that just refuses to clean itself up.

Long story short: one of my storage nodes (a combined OSD+MON host with a single OSD disk) in my 3-node storage cluster keeled over, and in the short term I'm running its OSD in a USB HDD dock attached to one of the remaining nodes. (I have replacement hardware coming.)

This evening the OSD daemon looking after that disk hiccupped and went into zombie mode; the only way I could get that OSD working again was to reboot the host. After it came back up I had 4 placement groups "damaged". Ceph has managed to clean up 3 of them, but one remains stubbornly stuck (on a disk *not* connected via USB):

> 2019-05-16 20:44:16.608770 7f4326ff0700 -1 bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x6706be76, expected 0x6ee89d7d, device location [0x1ac1560000~1000], logical extent 0x0~1000, object #7:0c2fe490:::rbd_data.b48c12ae8944a.0000000000000faa:head#

As this is BlueStore, it's not clear what I should do to resolve that, so I thought I'd "RTFM" before asking here:

http://docs.ceph.com/docs/luminous/rados/operations/pg-repair/

Maybe there's a secret handshake my web browser doesn't know about, or maybe the page is written in invisible ink, but that page appears blank to me.

-- 
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.
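P.S. For the archives, here is my rough understanding of the inconsistent-PG workflow, pieced together while that pg-repair page stays blank for me. Treat it as a sketch rather than confirmed-good steps; I haven't run the repair yet, and <pool-name> and <pgid> below are placeholders I'd still need to fill in from the health output:

  # find the PG(s) currently flagged inconsistent
  ceph health detail
  rados list-inconsistent-pg <pool-name>

  # show which object/shard failed its checksum inside that PG
  rados list-inconsistent-obj <pgid> --format=json-pretty

  # ask the primary OSD to repair the PG, then re-verify with a deep scrub
  ceph pg repair <pgid>
  ceph pg deep-scrub <pgid>

My understanding is that with BlueStore's checksums the repair can tell which copy is the bad one, but I'd appreciate confirmation from someone who has been through this before I run it blindly.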