There may be a problem with your disk. Did you check the syslog or the
dmesg log? From the code, it returns 'read_error' only when the read
returns EIO, so I suspect your disk has a sector error.

Stuart Longland <stuartl@xxxxxxxxxxxxxxxxxx> wrote on Sat, 18 May 2019 at 9:43 AM:
>
> On 18/5/19 11:34 am, huang jun wrote:
> > Stuart Longland <stuartl@xxxxxxxxxxxxxxxxxx> wrote on Sat, 18 May 2019 at 9:26 AM:
> >>
> >> On 16/5/19 8:55 pm, Stuart Longland wrote:
> >>> As this is Bluestore, it's not clear what I should do to resolve that,
> >>> so I thought I'd "RTFM" before asking here:
> >>> http://docs.ceph.com/docs/luminous/rados/operations/pg-repair/
> >>>
> >>> Maybe there's a secret hand-shake my web browser doesn't know about or
> >>> maybe the page is written in invisible ink, but that page appears blank
> >>> to me.
> >>
> >> Does anyone know why that page shows up blank? I still have a placement
> >> group that is "inconsistent". (A different one this time, but still!)
> >>
> > There may be something wrong with ceph.com; it's a blank page for me, too.
>
> Ahh okay, so I'm not going crazy … yet. :-)
>
> >> Some pages I've researched suggest going to the OSD's mount-point and
> >> moving the offending object away; however, Linux kernel 4.19.17 does not
> >> have a 'bluestore' driver, so I can't mount the file system to get at
> >> the offending object.
> >>
> >> Running `ceph pg repair <ID>` tells me it has "instructed" the OSD to do
> >> a repair. The OSD shows nothing at all in its logs even acknowledging
> >> the command, and the problem persists.
> >> The only log messages I have of the issue are from yesterday:
> >>
> >>> 2019-05-17 05:59:53.170552 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a shard 3 soid 7:581d78de:::rbd_data.b48c7238e1f29.0000000000001b34:head : candidate had a read error
> >>> 2019-05-17 07:07:20.723999 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a shard 3 soid 7:5b335293:::rbd_data.8c9e1238e1f29.0000000000001438:head : candidate had a read error
> >>> 2019-05-17 07:29:16.537539 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a deep-scrub 0 missing, 2 inconsistent objects
> >>> 2019-05-17 07:29:16.537557 7f009b0be700 -1 log_channel(cluster) log [ERR] : 7.1a deep-scrub 2 errors
> >>
> >> … not from just now when I issued the command. Why is my `ceph pg
> >> repair` command being ignored?
> >
> > 'ceph pg repair' will have the PG scrub and repair the inconsistency.
> > Do you still see these warning messages after 'pg repair'?
>
> Yes, I've been running `ceph pg repair 7.1a` repeatedly for the past 4
> hours. No new log messages, and still `ceph health detail` shows this:
>
> > carbon ~ # ceph pg repair 7.1a
> > instructing pg 7.1a on osd.2 to repair
> > carbon ~ # ceph health detail
> > HEALTH_ERR 2 scrub errors; Possible data damage: 1 pg inconsistent
> > OSD_SCRUB_ERRORS 2 scrub errors
> > PG_DAMAGED Possible data damage: 1 pg inconsistent
> >     pg 7.1a is active+clean+inconsistent, acting [2,3]
>
> I've also tried `ceph pg deep-scrub 7.1a` to no effect.
>
> I may shut the cluster down later to do some power infrastructure work
> (I need to add a new power distribution box to power two new nodes) and
> possibly even install a new 48-port Ethernet switch, but right now I'd
> like to try and get my storage cluster back to health.
>
> Regards,
> --
> Stuart Longland (aka Redhatter, VK4MSL)
>
> I haven't lost my mind...
> ...it's backed up on a tape somewhere.

-- 
Thank you!
HuangJun

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
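The disk check suggested at the top of the thread could be sketched roughly as below. This is only an illustrative example: `/dev/sdX` is a placeholder for whatever device backs the suspect OSD (osd.3 here), and the grep patterns are just common kernel I/O-error signatures, not an exhaustive list.

```shell
# Sketch: look for read errors from the OSD's backing disk.
# /dev/sdX is a placeholder -- substitute the device backing osd.3.

# Filter typical kernel I/O-error signatures out of a log stream
# (reads stdin when no file argument is given).
io_error_lines() {
  grep -E 'I/O error|blk_update_request|Medium Error' "$@"
}

# Usage (needs root):
#   dmesg | io_error_lines
#   io_error_lines /var/log/syslog
#   smartctl -a /dev/sdX   # then check reallocated/pending sector counts
```

If dmesg or SMART shows pending/reallocated sectors on that device, the 'candidate had a read error' lines are consistent with a failing disk rather than a Ceph-level problem.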
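For inspecting what scrub actually flagged before (or after) repairing, a rough sketch of the usual workflow is below. It assumes a reachable cluster with the `rados` and `ceph` CLIs on PATH; the grep/sed helper is only a quick assumption-laden way to pull object names out of the JSON, not a robust parser.

```shell
# Sketch: inspect what deep-scrub flagged in PG 7.1a before repairing.
# Assumes the rados/ceph CLIs and a running cluster.
#
#   rados list-inconsistent-obj 7.1a --format=json-pretty
#   ceph pg deep-scrub 7.1a   # refresh scrub results
#   ceph pg repair 7.1a
#   ceph health detail        # check whether the PG is still inconsistent

# Helper: crudely pull object names out of list-inconsistent-obj JSON.
inconsistent_names() {
  grep -o '"name": *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}
```

For example, `rados list-inconsistent-obj 7.1a --format=json-pretty | inconsistent_names` would print the affected object names, which should match the `rbd_data.*` soids in the scrub errors above.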