Hi guys,
Quick question. I have a VM with some SCSI drives that act as OSDs in my test lab. I removed one of the SCSI drives, so it's completely gone from the system; syslog is full of I/O errors, but the cluster still looks healthy.
Can anyone tell me why? I'm trying to reproduce what would happen if a real drive failed.
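(For anyone wanting to reproduce this: one way to pull a disk out from under a running Linux guest is the sysfs delete interface; the device name sdb below is taken from my kernel log, so adjust as needed.)
# echo 1 > /sys/block/sdb/device/delete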
# ll /dev/sd*
brw-rw---- 1 root disk 8, 0 Feb 19 11:13 /dev/sda
brw-rw---- 1 root disk 8, 1 Feb 17 16:45 /dev/sda1
brw-rw---- 1 root disk 8, 2 Feb 17 16:45 /dev/sda2
brw-rw---- 1 root disk 8, 5 Feb 17 16:45 /dev/sda5
brw-rw---- 1 root disk 8, 32 Feb 19 11:13 /dev/sdc
brw-rw---- 1 root disk 8, 33 Feb 17 16:45 /dev/sdc1
brw-rw---- 1 root disk 8, 34 Feb 19 11:11 /dev/sdc2
brw-rw---- 1 root disk 8, 48 Feb 19 11:13 /dev/sdd
brw-rw---- 1 root disk 8, 49 Feb 17 16:45 /dev/sdd1
brw-rw---- 1 root disk 8, 50 Feb 19 11:05 /dev/sdd2
Feb 19 11:06:02 ceph-test-vosd-03 kernel: [586497.813485] sd 2:0:1:0: [sdb] Synchronizing SCSI cache
Feb 19 11:06:13 ceph-test-vosd-03 kernel: [586508.197668] XFS (sdb1): metadata I/O error: block 0x39e116d3 ("xlog_iodone") error 19 numblks 64
Feb 19 11:06:13 ceph-test-vosd-03 kernel: [586508.197815] XFS (sdb1): xfs_do_force_shutdown(0x2) called from line 1115 of file /build/buildd/linux-lts-saucy-3.11.0/fs/xfs/xfs_log.c. Return address = 0xffffffffa01e1fe1
Feb 19 11:06:13 ceph-test-vosd-03 kernel: [586508.197823] XFS (sdb1): Log I/O Error Detected. Shutting down filesystem
Feb 19 11:06:13 ceph-test-vosd-03 kernel: [586508.197880] XFS (sdb1): Please umount the filesystem and rectify the problem(s)
Feb 19 11:06:43 ceph-test-vosd-03 kernel: [586538.306817] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:07:13 ceph-test-vosd-03 kernel: [586568.415986] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:07:43 ceph-test-vosd-03 kernel: [586598.525178] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:08:13 ceph-test-vosd-03 kernel: [586628.634356] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:08:43 ceph-test-vosd-03 kernel: [586658.743533] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:09:13 ceph-test-vosd-03 kernel: [586688.852714] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:09:43 ceph-test-vosd-03 kernel: [586718.961903] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:10:13 ceph-test-vosd-03 kernel: [586749.071076] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:10:43 ceph-test-vosd-03 kernel: [586779.180263] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:11:13 ceph-test-vosd-03 kernel: [586809.289440] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:11:44 ceph-test-vosd-03 kernel: [586839.398626] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:12:14 ceph-test-vosd-03 kernel: [586869.507804] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:12:44 ceph-test-vosd-03 kernel: [586899.616988] XFS (sdb1): xfs_log_force: error 5 returned.
Feb 19 11:12:52 ceph-test-vosd-03 kernel: [586907.848993] end_request: I/O error, dev fd0, sector 0
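(Error 19 in the XFS messages is ENODEV and error 5 is EIO, i.e. the device is gone and any further I/O to it fails. The last line about fd0 is just the floppy device and unrelated.)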
# mount
/dev/sdb1 on /var/lib/ceph/osd/ceph-6 type xfs (rw,noatime)
/dev/sdc1 on /var/lib/ceph/osd/ceph-7 type xfs (rw,noatime)
/dev/sdd1 on /var/lib/ceph/osd/ceph-8 type xfs (rw,noatime)
# ll /var/lib/ceph/osd/ceph-6
ls: cannot access /var/lib/ceph/osd/ceph-6: Input/output error
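Note that the ceph-osd process for osd.6 may well still be running and heartbeating even though its filesystem is dead. A quick way to check (the admin socket path below is the default and an assumption on my part):
# ps aux | grep '[c]eph-osd'
# ceph --admin-daemon /var/run/ceph/ceph-osd.6.asok version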
# ceph osd tree (excerpt)
-4 2.7 host ceph-test-vosd-03
6 0.9 osd.6 up 1
7 0.9 osd.7 up 1
8 0.9 osd.8 up 1
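My understanding (and I may be wrong, hence the question) is that the monitors only mark an OSD down once it stops heartbeating or gets reported by its peers, so a daemon whose disk vanished can stay "up" as long as it never touches the disk. If I just wanted the cluster to react, I assume I could force it by hand:
# ceph osd down 6     (a daemon that is still alive will mark itself back up)
# ceph osd out 6      (takes osd.6 out of data placement so recovery starts)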
# ceph-disk list
/dev/fd0 other, unknown
/dev/sda :
/dev/sda1 other, ext2
/dev/sda2 other
/dev/sda5 other, LVM2_member
/dev/sdc :
/dev/sdc1 ceph data, active, cluster ceph, osd.7, journal /dev/sdc2
/dev/sdc2 ceph journal, for /dev/sdc1
/dev/sdd :
/dev/sdd1 ceph data, active, cluster ceph, osd.8, journal /dev/sdd2
/dev/sdd2 ceph journal, for /dev/sdd1
# ceph -s
cluster 1a588c94-6f5e-4b04-bc07-f5ce99b91a35
health HEALTH_OK
monmap e7: 3 mons at {ceph-test-mon-01=172.17.12.11:6789/0,ceph-test-mon-02=172.17.12.12:6789/0,ceph-test-mon-03=172.17.12.13:6789/0}, election epoch 50, quorum 0,1,2 ceph-test-mon-01,ceph-test-mon-02,ceph-test-mon-03
mdsmap e4: 1/1/1 up {0=ceph-test-admin=up:active}
osdmap e124: 9 osds: 9 up, 9 in
pgmap v1812: 256 pgs, 13 pools, 1522 MB data, 469 objects
3379 MB used, 8326 GB / 8329 GB avail
256 active+clean
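My working theory is that osd.6 simply hasn't had to touch its now-dead filesystem because the cluster is idle, so it never notices. Pushing some writes through should make it hit the I/O errors and abort, after which I'd expect it to be marked down. Something like this (the pool name rbd is just an example):
# rados bench -p rbd 30 write
# ceph -w     (watch for osd.6 being marked down)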
Regards.