Re: OSD_TOO_MANY_REPAIRS on random OSDs causing clients to hang

Hi!

There are no kernel log messages that indicate read errors on the disk, and the error is not tied to one specific OSD. So far the errors have appeared on 7 different OSDs, and when we restart an OSD that reports errors, the errors appear on one of the other OSDs in the same PG. As you can see, after restarting osd.34 the errors continue to appear on osd.284, which holds the same PG:

HEALTH_WARN Too many repaired reads on 2 OSDs; 1 slow ops, oldest one blocked for 9138 sec, osd.284 has slow ops
[WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
    osd.284 had 172635 reads repaired
    osd.34 had 26907 reads repaired
[WRN] SLOW_OPS: 1 slow ops, oldest one blocked for 9138 sec, osd.284 has slow ops
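
To track how these counters move between OSDs over time, the per-OSD numbers can be pulled out of the health text. This is only a minimal sketch: it assumes the output format shown above, and the function name parse_repaired_reads is made up for illustration.

```python
import re

# Matches lines such as "    osd.284 had 172635 reads repaired"
REPAIRED_RE = re.compile(r"^\s*(osd\.\d+) had (\d+) reads repaired\s*$")


def parse_repaired_reads(health_detail: str) -> dict[str, int]:
    """Return {osd_name: repaired_read_count} found in health detail text.

    Hypothetical helper: it only understands the line format quoted in this
    thread and ignores everything else.
    """
    counts: dict[str, int] = {}
    for line in health_detail.splitlines():
        match = REPAIRED_RE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


# Sample text copied from the health output in this message.
SAMPLE = """\
HEALTH_WARN Too many repaired reads on 2 OSDs; 1 slow ops, oldest one blocked for 9138 sec, osd.284 has slow ops
[WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 2 OSDs
    osd.284 had 172635 reads repaired
    osd.34 had 26907 reads repaired
[WRN] SLOW_OPS: 1 slow ops, oldest one blocked for 9138 sec, osd.284 has slow ops
"""

if __name__ == "__main__":
    for osd, count in sorted(parse_repaired_reads(SAMPLE).items()):
        print(osd, count)
```

Running this periodically against saved health output and comparing the dictionaries would show which OSD's counter is actually growing after a restart.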

Also, the curious thing is that it only occurs in pool id 42...
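
One way to check the "same PG, only pool 42" observation is to intersect the PG lists of the two OSDs and keep the PGs whose ID starts with the pool number (PG IDs have the form pool.hex). The sketch below assumes the PG IDs have already been collected as plain strings, for example from the output of ceph pg ls-by-osd; the sample IDs and the helper names are invented for illustration and are not from our cluster.

```python
def pool_of(pgid: str) -> int:
    """Return the pool id encoded in a PG id such as '42.1f'."""
    return int(pgid.split(".", 1)[0])


def shared_pgs_in_pool(pgs_a, pgs_b, pool_id: int) -> list[str]:
    """Return the PG ids present on both OSDs that belong to pool_id."""
    common = set(pgs_a) & set(pgs_b)
    return sorted(pg for pg in common if pool_of(pg) == pool_id)


if __name__ == "__main__":
    # Hypothetical PG lists for two OSDs; real values would come from the cluster.
    pgs_osd_34 = ["42.1f", "42.a3", "7.2"]
    pgs_osd_284 = ["42.1f", "7.2", "42.b0"]
    print(shared_pgs_in_pool(pgs_osd_34, pgs_osd_284, 42))
```

If the result is a single PG, that PG would be the natural candidate to inspect further.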


The only errors we saw on the node whose motherboard we replaced:
[Sat Mar 25 21:49:31 2023] mce: [Hardware Error]: Machine check events logged
[Tue Mar 28 20:00:28 2023] mce: [Hardware Error]: Machine check events logged
[Wed Apr 19 01:50:41 2023] mce: [Hardware Error]: Machine check events logged

As we understand it, "mce: [Hardware Error]" suggests a memory fault or some other type of hardware error.
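
To line these events up against the times the repaired-read counters jump, the timestamps can be parsed out of the kernel log. A minimal sketch, assuming human-readable timestamps in the format shown above (as printed by dmesg -T) and an English locale; parse_mce_events is a made-up name.

```python
import re
from datetime import datetime

# Matches e.g. "[Sat Mar 25 21:49:31 2023] mce: [Hardware Error]: Machine check events logged"
MCE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] mce: \[Hardware Error\]: (?P<msg>.*)$")


def parse_mce_events(log_text: str) -> list[tuple[datetime, str]]:
    """Return (timestamp, message) pairs for mce hardware-error lines."""
    events = []
    for line in log_text.splitlines():
        match = MCE_RE.match(line.strip())
        if match:
            # Assumes English day and month abbreviations in the timestamp.
            ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
            events.append((ts, match.group("msg")))
    return events


# Log lines copied from this message.
SAMPLE_LOG = """\
[Sat Mar 25 21:49:31 2023] mce: [Hardware Error]: Machine check events logged
[Tue Mar 28 20:00:28 2023] mce: [Hardware Error]: Machine check events logged
[Wed Apr 19 01:50:41 2023] mce: [Hardware Error]: Machine check events logged
"""

if __name__ == "__main__":
    for ts, msg in parse_mce_events(SAMPLE_LOG):
        print(ts.isoformat(), msg)
```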


--thomas



> On 26 Apr 2023, at 13:55, Robert Sander <r.sander@xxxxxxxxxxxxxxxxxxx> wrote:
> 
> On 26.04.23 13:24, Thomas Hukkelberg wrote:
> 
>> [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 1 OSDs
>>     osd.34 had 9936 reads repaired
> 
> Are there any messages in the kernel log that indicate this device has read errors? Have you considered replacing the disk?
> 
> Regards
> -- 
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
> 
> https://www.heinlein-support.de
> 
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
> 
> Amtsgericht Berlin-Charlottenburg - HRB 220009 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx




