> On 1 Jun 2017, at 11:55, Matthew Vernon <mv3@xxxxxxxxxxxx> wrote:
>
> You don't say what's in kern.log - we've had (rotating) disks that were throwing read errors but still saying they were OK on SMART.

Fair point. There was nothing correlating with the time that ceph logged an error this morning, which is why I didn't mention it, but looking harder I see that yesterday there was:

May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 Sense Key : Hardware Error [current]
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 Add. Sense: Internal target failure
May 31 07:20:13 osd1 kernel: sd 0:0:8:0: [sdi] tag#0 CDB: Read(10) 28 00 77 51 42 d8 00 02 00 00
May 31 07:20:13 osd1 kernel: blk_update_request: critical target error, dev sdi, sector 2001814232

sdi was the disk with the OSD affected today, so I guess it's flaky SSDs then. It's odd that simply re-reading the file makes everything OK, though - wondering how much that is worth worrying about, or whether there's a way of making ceph retry reads automatically?

Oliver.
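(For anyone wanting to re-check a flagged sector directly, a rough Python sketch along these lines should work. It scans the kernel log for "blk_update_request: critical target error" lines and re-reads the reported region from the raw device; the /var/log/kern.log path, the 512-byte sector units, the 256 KiB read span and the need for root are all assumptions, and it only drops the page cache for that region so the read actually hits the device - treat it as a starting point, not a definitive check.)

#!/usr/bin/env python3
"""Rough check: re-read sectors that the kernel flagged with
"blk_update_request: critical target error" and see whether they
still fail. Needs root to read the block devices directly."""

import os
import re
import sys

# Matches lines like:
#   blk_update_request: critical target error, dev sdi, sector 2001814232
LOG_RE = re.compile(
    r"blk_update_request: critical target error, dev (\w+), sector (\d+)")

SECTOR_SIZE = 512        # the kernel reports sectors in 512-byte units
READ_SPAN = 256 * 1024   # read 256 KiB starting at the reported sector


def find_errors(kern_log="/var/log/kern.log"):
    """Yield unique (device, sector) pairs from the kernel log."""
    seen = set()
    with open(kern_log, errors="replace") as fh:
        for line in fh:
            m = LOG_RE.search(line)
            if m and (m.group(1), int(m.group(2))) not in seen:
                seen.add((m.group(1), int(m.group(2))))
                yield m.group(1), int(m.group(2))


def reread(dev, sector):
    """Drop any cached pages for the region and read it from the device.
    Returns True if the read succeeds, False on an I/O error."""
    fd = os.open(f"/dev/{dev}", os.O_RDONLY)
    offset = sector * SECTOR_SIZE
    try:
        # Make sure the read hits the device rather than the page cache.
        os.posix_fadvise(fd, offset, READ_SPAN, os.POSIX_FADV_DONTNEED)
        os.lseek(fd, offset, os.SEEK_SET)
        os.read(fd, READ_SPAN)   # raises OSError (EIO) if the region is bad
        return True
    except OSError as exc:
        print(f"/dev/{dev} sector {sector}: {exc}", file=sys.stderr)
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    for dev, sector in find_errors():
        status = "reads OK now" if reread(dev, sector) else "still failing"
        print(f"/dev/{dev} sector {sector}: {status}")

A read that now succeeds is consistent with the behaviour described above (re-reading the file clears the error); a persistent EIO would point at a genuinely bad region on the SSD.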