Re: xfs corruption

It is perfectly reasonable and common to use hardware RAID cards in writeback mode under XFS (and under Ceph) if you configure them properly.

The key thing is that with the writeback cache enabled, you need to make sure that the SATA drives' own write caches are disabled. Also make sure that your file system is mounted with barriers enabled.
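
For example, barriers are on by default for XFS, but they can be made explicit at mount time. A minimal sketch; the device name and mount point below are illustrative:

    mount -o barrier /dev/sdi1 /var/lib/ceph/osd/ceph-0
    grep sdi1 /proc/mounts    # verify the active mount options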

To check the write cache state on drives sitting behind a RAID card, you often need to use RAID-card-specific tools to query and set it.
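
For example, on LSI MegaRAID controllers something like the following should work (a sketch assuming MegaCli is installed; binary path, option casing, and adapter numbering vary by version):

    MegaCli64 -LDGetProp -DskCache -LAll -aAll     # query the drives' cache setting
    MegaCli64 -LDSetProp -DisDskCache -LAll -aAll  # disable the drives' own caches
    MegaCli64 -LDSetProp WB -LAll -aAll            # keep the controller cache in writeback

Other vendors (Areca, HP Smart Array, Adaptec) ship their own equivalent tools.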

Regards,

Ric




On 02/27/2016 07:20 AM, fangchen sun wrote:

Thank you for your response!

All my hosts have RAID cards. Some of the cards are in pass-through mode, and the others are in write-back mode. I will set all the RAID cards to pass-through mode and observe for a period of time.


Best Regards
sunspot


2016-02-25 20:07 GMT+08:00 Ferhat Ozkasgarli <ozkasgarli@xxxxxxxxx>:

    This has happened to me before, but in a virtual machine environment.

    The VM was KVM and the storage was RBD. My problem was a bad network cable.

    You should check the following details:

    1-) Do you use any kind of hardware RAID configuration? (RAID 0, 5, or 10)

    Ceph does not work well on hardware RAID systems. You should use the RAID
    cards in HBA (non-RAID) mode and let the card pass the disks through.

    2-) Check your network connections (see the example after this list).

    It may seem an obvious thing to check, but believe me, the network is one
    of the top-rated culprits in Ceph environments.

    3-) If you are using SSD disks, make sure you use a non-RAID configuration.
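
    A quick way to spot a bad cable or a flaky NIC is to look at the interface
    error counters. A minimal sketch; the interface name eth0 is illustrative:

        ip -s link show eth0                       # RX/TX errors and drops
        ethtool -S eth0 | grep -i -e err -e drop   # per-NIC error counters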



    On Tue, Feb 23, 2016 at 10:55 PM, fangchen sun <sunspot0105@xxxxxxxxx> wrote:

        Dear all:

        I have a Ceph object storage cluster with 143 OSDs and 7 RadosGW
        instances, and chose XFS as the underlying file system.
        I recently ran into a problem where an OSD is sometimes marked down
        after the function "chain_setxattr()" returns -117. I can only fix it
        by unmounting the disk and repairing it with "xfs_repair".

        os: centos 6.5
        kernel version: 2.6.32
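
        For reference, -117 is -EUCLEAN ("Structure needs cleaning"), which
        XFS returns when it detects on-disk corruption. A typical repair
        cycle is sketched below; the OSD id and device name are illustrative,
        and the CentOS 6 sysvinit script is assumed:

            service ceph stop osd.42        # stop the affected OSD first
            umount /dev/sdi1
            xfs_repair -n /dev/sdi1         # dry run: only report problems
            xfs_repair /dev/sdi1            # repair for real
            mount /dev/sdi1 /var/lib/ceph/osd/ceph-42
            service ceph start osd.42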

        the log for dmesg command:
        [41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted
        2.6.32-925.431.23.3.letv.el6.x86_64 #1
        [41796028.532227] Call Trace:
        [41796028.532255]  [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
        [41796028.532276]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
        [41796028.532296]  [<ffffffffa01e1ece>] ?
        xfs_corruption_error+0x5e/0x90 [xfs]
        [41796028.532316]  [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
        [41796028.532335]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
        [41796028.532359]  [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
        [41796028.532380]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
        [41796028.532399]  [<ffffffffa01bc481>] ?
        xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
        [41796028.532426]  [<ffffffffa01bc481>] ?
        xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
        [41796028.532455]  [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70
        [xfs]
        [41796028.532476]  [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
        [41796028.532495]  [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510
        [xfs]
        [41796028.532517]  [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
        [41796028.532536]  [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
        [41796028.532560]  [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
        [41796028.532584]  [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20
        [xfs]
        [41796028.532592]  [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
        [41796028.532596]  [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
        [41796028.532600]  [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
        [41796028.532604]  [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
        [41796028.532607]  [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
        [41796028.532612]  [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
        [41796028.532617]  [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
        [41796028.532621]  [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
        [41796028.532626]  [<ffffffff8152912e>] ? thread_return+0x4e/0x760
        [41796028.532630]  [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
        [41796028.532633]  [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
        [41796028.532636] XFS (sdi1): Corruption detected. Unmount and run
        xfs_repair

        Any comments will be much appreciated!

        Best Regards!
        sunspot



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


