Hi all,

I've got a placement group on a cluster that just refuses to clean itself up.

Long story short: one of my storage nodes (a combined OSD+MON host with a single OSD disk) in my 3-node storage cluster keeled over, and in the short term I'm running its OSD in a USB HDD dock attached to one of the remaining nodes. (I have replacement hardware coming.)

This evening the OSD daemon looking after that disk hiccupped and went into zombie mode; the only way I could get that OSD working again was to reboot the host. After it came back up I had 4 placement groups "damaged". Ceph has managed to clean up 3 of them, but one remains stubbornly stuck (on a disk *not* connected via USB):

> 2019-05-16 20:44:16.608770 7f4326ff0700 -1 bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x6706be76, expected 0x6ee89d7d, device location [0x1ac1560000~1000], logical extent 0x0~1000, object #7:0c2fe490:::rbd_data.b48c12ae8944a.0000000000000faa:head#

As this is BlueStore, it's not clear what I should do to resolve that, so I thought I'd "RTFM" before asking here:

http://docs.ceph.com/docs/luminous/rados/operations/pg-repair/

Maybe there's a secret handshake my web browser doesn't know about, or maybe the page is written in invisible ink, but that page appears blank to me.

-- 
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.
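P.S. For the archives, here is my rough understanding of the inconsistent-PG workflow, pieced together while that pg-repair page stays blank for me. Treat it as a sketch rather than confirmed-good steps; I haven't run the repair yet, and <pool-name> and <pgid> below are placeholders I'd still need to fill in from the health output:

  # find the PG(s) currently flagged inconsistent
  ceph health detail
  rados list-inconsistent-pg <pool-name>

  # show which object/shard failed its checksum inside that PG
  rados list-inconsistent-obj <pgid> --format=json-pretty

  # ask the primary OSD to repair the PG, then re-verify with a deep scrub
  ceph pg repair <pgid>
  ceph pg deep-scrub <pgid>

My understanding is that with BlueStore's checksums the repair can tell which copy is the bad one, but I'd appreciate confirmation from someone who has been through this before I run it blindly.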