On 18/5/19 11:34 am, huang jun wrote:
> Stuart Longland <stuartl@xxxxxxxxxxxxxxxxxx> wrote on Saturday, 18 May 2019 at 9:26 AM:
>>
>> On 16/5/19 8:55 pm, Stuart Longland wrote:
>>> As this is Bluestore, it's not clear what I should do to resolve that,
>>> so I thought I'd "RTFM" before asking here:
>>> http://docs.ceph.com/docs/luminous/rados/operations/pg-repair/
>>>
>>> Maybe there's a secret hand-shake my web browser doesn't know about, or
>>> maybe the page is written in invisible ink, but that page appears blank
>>> to me.
>>
>> Does anyone know why that page shows up blank?  I still have a placement
>> group that is "inconsistent".  (A different one this time, but still!)
>>
> That may be something wrong with ceph.com; it's a blank page for me too.

Ahh okay, so I'm not going crazy … yet. :-)

>> Some pages I've researched suggest going to the OSD's mount-point and
>> moving the offending object away; however, Linux kernel 4.19.17 does not
>> have a 'bluestore' driver, so I can't mount the file system to get at
>> the offending object.
>>
>> Running `ceph pg repair <ID>` tells me it has "instructed" the OSD to do
>> a repair.  The OSD shows nothing at all in its logs even acknowledging
>> the command, and the problem persists.  The only log messages I have of
>> the issue are from yesterday:
>>
>>> 2019-05-17 05:59:53.170552 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a shard 3 soid 7:581d78de:::rbd_data.b48c7238e1f29.0000000000001b34:head : candidate had a read error
>>> 2019-05-17 07:07:20.723999 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a shard 3 soid 7:5b335293:::rbd_data.8c9e1238e1f29.0000000000001438:head : candidate had a read error
>>> 2019-05-17 07:29:16.537539 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a deep-scrub 0 missing, 2 inconsistent objects
>>> 2019-05-17 07:29:16.537557 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a deep-scrub 2 errors
>>
>> … not from just now when I issued the command.  Why is my `ceph pg
>> repair` command being ignored?
>
> ceph pg repair will have the pg scrub and repair the inconsistent objects.
> Do you still see these warning messages after 'pg repair'?

Yes, I've been running `ceph pg repair 7.1a` repeatedly for the past 4
hours.  No new log messages, and `ceph health detail` still shows this:

> carbon ~ # ceph pg repair 7.1a
> instructing pg 7.1a on osd.2 to repair
> carbon ~ # ceph health detail
> HEALTH_ERR 2 scrub errors; Possible data damage: 1 pg inconsistent
> OSD_SCRUB_ERRORS 2 scrub errors
> PG_DAMAGED Possible data damage: 1 pg inconsistent
>     pg 7.1a is active+clean+inconsistent, acting [2,3]

I've also tried `ceph pg deep-scrub 7.1a`, to no effect.

I may shut the cluster down later to do some power infrastructure work
(I need to add a new power distribution box to power two new nodes) and
possibly even install a new 48-port Ethernet switch, but right now I'd
like to get my storage cluster back to health.

Regards,
-- 
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com