Re: xfs corruption

Ric Wheeler <rwheeler@xxxxxxxxxx> · Mon, 7 Mar 2016 13:51:30 +0530

Unfortunately, you will have to follow up with the hardware RAID card vendors to 
see what commands their firmware handles.
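
Before asking the vendor, it can help to see what the kernel itself reports for the logical device. Linux exposes the discard limit in sysfs as `queue/discard_max_bytes`, and a value of 0 means the block layer will not send discards to that device. The following is a minimal Python sketch under that assumption; the helper names and the `sysfs_root` parameter are illustrative only. A non-zero value shows only what the RAID card advertises to the host, not that its firmware forwards the command to the SSDs behind it.

```python
import os

def discard_max_bytes(device, sysfs_root="/sys/block"):
    """Return the discard limit the kernel reports for `device` (e.g. "sda").

    Returns 0 when the attribute is missing, unreadable, or zero; in that
    case the block layer will not issue discard requests to the device.
    """
    path = os.path.join(sysfs_root, device, "queue", "discard_max_bytes")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return 0

def supports_discard(device, sysfs_root="/sys/block"):
    """True if the kernel advertises a non-zero discard limit for `device`."""
    return discard_max_bytes(device, sysfs_root) > 0

if __name__ == "__main__":
    # List every block device the kernel knows about (Linux only).
    if os.path.isdir("/sys/block"):
        for dev in sorted(os.listdir("/sys/block")):
            print(dev, "discard" if supports_discard(dev) else "no discard")
```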

Good luck!

Ric

On 03/07/2016 01:37 PM, Ferhat Ozkasgarli wrote:
I always forget to use reply-all.

"RAID5 and RAID10 (or other raid levels) are a property of the block devices.
XFS, ext4, etc can pass down those commands to the firmware on the card and it
is up to the firmware to propagate the command on to the backend drives."

You mean I can get a hardware RAID card that can pass discard and trim commands
to the disks in a RAID 10 array?

Can you please suggest such a RAID card?

We are on the verge of deciding between hardware RAID and software RAID. Our
OpenStack cluster uses all-SSD storage (local RAID 10), and my manager wants to
use hardware RAID with the SSDs.

On Mon, Mar 7, 2016 at 10:04 AM, Ric Wheeler <rwheeler@xxxxxxxxxx> wrote:

    You are right that some cards might not send those commands on to the
    backend storage, but spinning disks don't usually implement either trim or
    discard (SSDs do, though).

    XFS, ext4, etc can pass down those commands to the firmware on the card
    and it is up to the firmware to propagate the command on to the backend
    drives.

    The file system layer itself does track allocation internally in its
    layer, so you will benefit from being able to reuse those blocks after a
    trim command (even without a raid card of any kind).

    Regards,

    Ric

    On 03/07/2016 12:58 PM, Ferhat Ozkasgarli wrote:

        Ric, you mean a RAID 0 environment, right?

        If you use RAID 5, RAID 10, or some other more complex RAID
        configuration, most of the physical disks' capabilities (trim,
        discard, etc.) vanish.

        Only a handful of hardware RAID cards are able to pass trim and discard
        commands to the physical disks when the configuration is RAID 0 or
        RAID 1.

        On Mon, Mar 7, 2016 at 9:21 AM, Ric Wheeler <rwheeler@xxxxxxxxxx> wrote:

            It is perfectly reasonable and common to use hardware RAID cards in
            writeback mode under XFS (and under Ceph) if you configure them
            properly.

            The key thing is that with the writeback cache enabled, you need to
            make sure that the S-ATA drives' own write cache is disabled. Also
            make sure that your file system is mounted with "barrier" enabled.

            To check the backend write cache state on drives, you often need to
            use RAID card specific tools to query and set them.
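
The mount-option half of this advice can be checked from the host without vendor tools. A rough Python sketch follows, assuming (as on kernels of this era) that XFS enables barriers by default and that a file system mounted without them shows `nobarrier` in its `/proc/mounts` options. The function name is hypothetical, and the check says nothing about the drives' own write cache, which still needs the card's tools.

```python
import os

def xfs_mounts_without_barriers(mounts_text):
    """Return mount points of XFS file systems mounted with "nobarrier".

    `mounts_text` is the content of /proc/mounts, one mount per line in the
    form "device mountpoint fstype options dump pass".
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "xfs":
            continue  # malformed line or not an XFS mount
        options = fields[3].split(",")
        if "nobarrier" in options:
            flagged.append(fields[1])
    return flagged

if __name__ == "__main__":
    # /proc/mounts exists only on Linux; skip quietly elsewhere.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            for mountpoint in xfs_mounts_without_barriers(f.read()):
                print("barriers disabled on", mountpoint)
```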

            Regards,

            Ric

            On 02/27/2016 07:20 AM, fangchen sun wrote:

                Thank you for your response!

                All my hosts have RAID cards. Some of the cards are in
                pass-through mode, and the others are in write-back mode. I
                will set all of them to pass-through mode and observe for a
                period of time.

                Best Regards
                sunspot

                2016-02-25 20:07 GMT+08:00 Ferhat Ozkasgarli <ozkasgarli@xxxxxxxxx>:

                    This has happened to me before, but in a virtual machine
                    environment.

                    The VM ran on KVM and the storage was RBD. My problem was a
                    bad network cable.

                    You should check the following details:

                    1-) Do you use any kind of hardware RAID configuration?
                    (RAID 0, 5 or 10)

                    Ceph does not work well on hardware RAID systems. You
                    should use RAID cards in HBA (non-RAID) mode and let the
                    card pass the disks through.

                    2-) Check your network connections

                    It may seem an obvious solution, but believe me, the
                    network is one of the top culprits in Ceph environments.

                    3-) If you are using SSD disks, make sure you use a
                    non-RAID configuration.

                    On Tue, Feb 23, 2016 at 10:55 PM, fangchen sun <sunspot0105@xxxxxxxxx> wrote:

                        Dear all:

                        I have a Ceph object storage cluster with 143 OSDs and
                        7 radosgw instances, and chose XFS as the underlying
                        file system.
                        I recently ran into a problem where an OSD is sometimes
                        marked down when the function "chain_setxattr()"
                        returns -117. All I can do is umount the disk and
                        repair it with "xfs_repair".
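
The -117 here is a negated errno value. On Linux, errno 117 is EUCLEAN ("Structure needs cleaning"), which XFS uses as its "file system is corrupted" code, so the return value matches the corruption message in the log below. A small Python sketch for decoding such values; the helper name is illustrative, and the symbolic name depends on the platform's errno table.

```python
import errno
import os

def decode_kernel_errno(ret):
    """Map a negative kernel-style return value to (symbolic name, message).

    Unknown codes get the placeholder name "UNKNOWN"; the message comes from
    the platform's strerror().
    """
    code = abs(ret)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

if __name__ == "__main__":
    name, message = decode_kernel_errno(-117)
    print(name, "-", message)
```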

                        os: centos 6.5
                        kernel version: 2.6.32

                        the log for dmesg command:
                        [41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted 2.6.32-925.431.23.3.letv.el6.x86_64 #1
                        [41796028.532227] Call Trace:
                        [41796028.532255] [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
                        [41796028.532276] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
                        [41796028.532296] [<ffffffffa01e1ece>] ? xfs_corruption_error+0x5e/0x90 [xfs]
                        [41796028.532316] [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
                        [41796028.532335] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
                        [41796028.532359] [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
                        [41796028.532380] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
                        [41796028.532399] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
                        [41796028.532426] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
                        [41796028.532455] [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70 [xfs]
                        [41796028.532476] [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
                        [41796028.532495] [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510 [xfs]
                        [41796028.532517] [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
                        [41796028.532536] [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
                        [41796028.532560] [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
                        [41796028.532584] [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20 [xfs]
                        [41796028.532592] [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
                        [41796028.532596] [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
                        [41796028.532600] [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
                        [41796028.532604] [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
                        [41796028.532607] [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
                        [41796028.532612] [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
                        [41796028.532617] [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
                        [41796028.532621] [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
                        [41796028.532626] [<ffffffff8152912e>] ? thread_return+0x4e/0x760
                        [41796028.532630] [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
                        [41796028.532633] [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
                        [41796028.532636] XFS (sdi1): Corruption detected. Unmount and run xfs_repair
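
With 143 OSDs, it may be useful to pull the affected device names out of the kernel log automatically rather than reading traces by hand. The sketch below is one possible approach in Python; it assumes the message format shown in the log above ("XFS (<device>): Corruption detected"), and the function name is hypothetical.

```python
import re

# Matches the message XFS logs when it detects on-disk corruption, e.g.
# "[41796028.532636] XFS (sdi1): Corruption detected. Unmount and run xfs_repair"
CORRUPTION_RE = re.compile(r"XFS \((?P<dev>[^)]+)\): Corruption detected")

def corrupted_xfs_devices(dmesg_text):
    """Return device names from XFS corruption messages, in first-seen order."""
    seen = []
    for match in CORRUPTION_RE.finditer(dmesg_text):
        dev = match.group("dev")
        if dev not in seen:  # report each device once
            seen.append(dev)
    return seen

if __name__ == "__main__":
    sample = (
        "[41796028.532633] [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b\n"
        "[41796028.532636] XFS (sdi1): Corruption detected. Unmount and run xfs_repair\n"
    )
    print(corrupted_xfs_devices(sample))
```

The output of `dmesg` can be fed to the function as a single string; each device it returns is a candidate for unmounting and `xfs_repair`.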

                        Any comments will be much appreciated!

                        Best Regards!
                        sunspot

            _______________________________________________
            ceph-users mailing list
            ceph-users@xxxxxxxxxxxxxx
            http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
