Hi,

Following up on two recent messages I sent to the list [1][2]: yesterday it happened again. There were slow requests on one OSD until it committed suicide, and many virtual machines crashed with disk errors.

I'm attaching the log for the OSD that failed. The logs of the virtual machines don't show anything. There were on-screen messages saying that the filesystems were being remounted read-only. Here is a transcription of the screen of one of them:

ata1.01: failed command: FLUSH CACHE
ata1.01: cmd e7/00:00:00:00:00/00:00:00:00:00/b0 tag 0
         res 40/00:01:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
ata1.01: status: { DRDY }
end_request: I/O error, dev sda, sector 1596912
end_request: I/O error, dev sda, sector 1596912
Buffer I/O error on device sda1, logical block 199358
.
.
Aborting journal on device sda1.
EXT3-fs (sda1): error: ext3_journal_start_sb: Detected aborted journal
EXT3-fs (sda1): error: remounting filesystem read-only
Buffer I/O error on device sda1
JBD: I/O error detected when updating journal superblock for sda1.
sd 0:0:1:0: [sda] Asking for cache data failed
sd 0:0:1:0: [sda] Assuming drive cache: write through

[1] http://article.gmane.org/gmane.comp.file-systems.ceph.user/15642/
[2] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-November/044887.html

Thanks,
Paulo
Attachment: ceph-osd.18.log.1.gz
Description: GNU Zip compressed data