IO stalls when primary OSD device blocks in 17.2.6

Dear cephers,

We sometimes observe stalled IO on our Ceph 17.2.6 cluster when the backing device of a PG's primary OSD fails and appears to block read IO to objects in that PG. If I set the OSD with the broken device to out, the IO continues. Setting the OSD to down is not sufficient.
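
For reference, marking an OSD down or out from the command line is done with `ceph osd down <id>` and `ceph osd out <id>`. The following is only a minimal Python sketch of how such a state change could be scripted. The helper names `osd_command` and `mark_osd` and the OSD id 12 are hypothetical and not taken from our setup, and the sketch defaults to a dry run that only prints the command.

```python
import subprocess


def osd_command(action: str, osd_id: int) -> list[str]:
    """Build the ceph CLI call that marks an OSD 'out' or 'down'.

    Hypothetical helper: it only assembles the argument list and
    does not contact the cluster.
    """
    if action not in ("out", "down"):
        raise ValueError(f"unsupported action: {action!r}")
    if osd_id < 0:
        raise ValueError("OSD id must be non-negative")
    return ["ceph", "osd", action, str(osd_id)]


def mark_osd(action: str, osd_id: int, dry_run: bool = True) -> list[str]:
    """Print the command (dry run) or execute it against the cluster."""
    cmd = osd_command(action, osd_id)
    if dry_run:
        # Only show what would be executed.
        print(" ".join(cmd))
    else:
        # Requires a host with the ceph CLI and a suitable keyring.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # 12 is a placeholder OSD id, not the OSD from this report.
    mark_osd("out", 12)
```

Calling `mark_osd("out", 12, dry_run=False)` would run the command for real, so it should only be used on a host where that is intended.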

The cluster is running on Debian 11, and the pool is an erasure-coded CephFS data pool. The OSD has an HDD data device and an SSD DB device. The data device is the one that failed and was blocking IO.

The OSD reported slow ops, and shortly afterwards smartd reported unreadable sectors.

Has anyone seen such behaviour? Are there some tweaks that I missed?

Kind regards,

Daniel
--
Daniel Schreiber
Facharbeitsgruppe Systemsoftware
Universitaetsrechenzentrum

Technische Universität Chemnitz
Straße der Nationen 62 (Raum B303)
09111 Chemnitz
Germany

Tel:     +49 371 531 35444
Fax:     +49 371 531 835444
