This has happened to me before, but in a virtual machine environment.
The VM was KVM and the storage was RBD. My problem turned out to be a bad network cable.
You should check the following details:
1-) Do you use any kind of hardware RAID configuration (RAID 0, 5, or 10)?
Ceph does not work well on hardware RAID. You should put the RAID card in HBA (non-RAID) mode and let it pass the disks straight through to the OS; a quick way to check follows after this list.
2-) Check your network connections
It may seem an obvious suggestion, but believe me, the network is one of the most common culprits in Ceph environments (see the commands after the list).
3-) If you are using SSD disks, make sure they too are in a non-RAID configuration; the same check as in item 1 applies.
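
For items 1 and 3, a quick way to check is to look at what the OS sees for each disk (a sketch; smartmontools is assumed to be installed, and the device name is an example):

    # If MODEL/Vendor shows the controller's virtual disk (e.g. "PERC",
    # "MegaRAID") instead of the physical drive, the disk is behind RAID.
    lsblk -o NAME,MODEL,SIZE,ROTA
    smartctl -i /dev/sdi

    # On older distros without lsblk (e.g. CentOS 6), this shows the
    # same vendor/model information:
    cat /proc/scsi/scsi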
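
For item 2, the usual suspects can be spotted from the OSD hosts themselves (a sketch; eth0 and the peer hostname are example names):

    # Per-interface error/drop counters; steadily rising errors point
    # at a bad cable, SFP, or switch port.
    ip -s link show eth0

    # Negotiated speed and duplex; a 10G link stuck at 1G or at half
    # duplex is a classic symptom.
    ethtool eth0

    # If you run jumbo frames, verify the MTU end to end between OSD
    # hosts (8972 bytes of payload = 9000 MTU minus IP/ICMP headers).
    ping -M do -s 8972 osd-host-2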
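
And since you already umount and xfs_repair the disk when this happens (per your message below), here is a rough outline of that cycle that avoids needless rebalancing (a sketch; the OSD id, device, and mount point are examples based on your log, and the init script matches sysvinit on CentOS 6):

    ceph osd set noout                          # keep the cluster from rebalancing
    /etc/init.d/ceph stop osd.12                # osd.12 is an example id
    umount /var/lib/ceph/osd/ceph-12
    xfs_repair -n /dev/sdi1                     # dry run first: report only, change nothing
    xfs_repair /dev/sdi1
    mount /dev/sdi1 /var/lib/ceph/osd/ceph-12
    /etc/init.d/ceph start osd.12
    ceph osd unset noout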
On Tue, Feb 23, 2016 at 10:55 PM, fangchen sun <sunspot0105@xxxxxxxxx> wrote:
Dear all:

I have a Ceph object storage cluster with 143 OSDs and 7 radosgw instances, with XFS as the underlying file system. I recently ran into a problem where an OSD is sometimes marked down when the return value of the function "chain_setxattr()" is -117. My only remedy is to umount the disk and repair it with "xfs_repair".

OS: CentOS 6.5
kernel version: 2.6.32

The log from the dmesg command:

[41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted 2.6.32-925.431.23.3.letv.el6.x86_64 #1
[41796028.532227] Call Trace:
[41796028.532255] [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
[41796028.532276] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
[41796028.532296] [<ffffffffa01e1ece>] ? xfs_corruption_error+0x5e/0x90 [xfs]
[41796028.532316] [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
[41796028.532335] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
[41796028.532359] [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
[41796028.532380] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
[41796028.532399] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
[41796028.532426] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
[41796028.532455] [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70 [xfs]
[41796028.532476] [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
[41796028.532495] [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510 [xfs]
[41796028.532517] [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
[41796028.532536] [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
[41796028.532560] [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
[41796028.532584] [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20 [xfs]
[41796028.532592] [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
[41796028.532596] [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
[41796028.532600] [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
[41796028.532604] [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
[41796028.532607] [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
[41796028.532612] [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
[41796028.532617] [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
[41796028.532621] [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
[41796028.532626] [<ffffffff8152912e>] ? thread_return+0x4e/0x760
[41796028.532630] [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
[41796028.532633] [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
[41796028.532636] XFS (sdi1): Corruption detected. Unmount and run xfs_repair

Any comments will be much appreciated!

Best Regards!
sunspot
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com