Re: xfs corruption

Ric Wheeler <rwheeler@xxxxxxxxxx> · Mon, 7 Mar 2016 13:34:48 +0530

You are right that some cards might not send those commands on to the backend 
storage, but spinning disks don't usually implement either trim or discard 
(SSDs do, though).

XFS, ext4, etc. can pass those commands down to the firmware on the card, and it 
is up to the firmware to propagate the command on to the backend drives.

The file system itself tracks allocation internally, so you will 
benefit from being able to reuse those blocks after a trim command 
(even without a RAID card of any kind).
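
For what it is worth, one way to see whether the block device Linux is presented with advertises discard at all is to read its discard_max_bytes attribute under /sys/block; the kernel documents a value of 0 there as "discard not supported". A minimal Python sketch, assuming your kernel exposes that attribute (the function name and the sysfs_root parameter are only illustrative, not part of any tool), might look like this:

```python
from pathlib import Path
from typing import Optional


def supports_discard(device: str, sysfs_root: str = "/sys/block") -> Optional[bool]:
    """Report whether a block device advertises discard support.

    Returns True/False based on queue/discard_max_bytes, or None when the
    attribute is missing or unreadable (older kernel, wrong device name).
    """
    attr = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    try:
        value = int(attr.read_text().strip())
    except (OSError, ValueError):
        return None
    # The kernel documents 0 as "discard not supported" for this queue.
    return value > 0


if __name__ == "__main__":
    # "sdi" is the disk behind sdi1 in the dmesg output of this thread;
    # substitute your own device name.
    print(supports_discard("sdi"))
```

Behind a RAID card this only tells you what the card's virtual device advertises to the kernel, not whether the firmware forwards the command to the backend drives.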

Regards,

Ric

On 03/07/2016 12:58 PM, Ferhat Ozkasgarli wrote:
Ric, you mean a RAID 0 environment, right?

If you use RAID 5, RAID 10, or some other more complex RAID configuration, 
most of the physical disks' abilities (trim, discard, etc.) vanish.

Only a handful of hardware RAID cards are able to pass trim and discard commands to 
the physical disks when the RAID configuration is RAID 0 or RAID 1.

On Mon, Mar 7, 2016 at 9:21 AM, Ric Wheeler <rwheeler@xxxxxxxxxx> wrote:

    It is perfectly reasonable and common to use hardware RAID cards in
    writeback mode under XFS (and under Ceph) if you configure them properly.

    The key thing is that with the writeback cache enabled, you need to make sure
    that the SATA drives' own write cache is disabled. Also make sure that
    your file system is mounted with "barrier" enabled.

    To check the backend write cache state on drives, you often need to use
    RAID card specific tools to query and set them.
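
As a rough way to audit the mount side of this advice: on kernels of this era, an XFS file system mounted without barriers shows nobarrier among its options in /proc/mounts. The Python sketch below assumes that behaviour and the usual whitespace-separated /proc/mounts layout, and lists such mounts. The function name is illustrative, and it does not look at the drive or controller cache settings, which still need the vendor's tools.

```python
from typing import List, Tuple


def xfs_mounts_without_barrier(mounts_text: str) -> List[Tuple[str, str]]:
    """Return (device, mount point) pairs for XFS mounts using "nobarrier".

    mounts_text is expected to hold the contents of /proc/mounts.
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        # Options are comma-separated; match the whole option, not a substring.
        if fs_type == "xfs" and "nobarrier" in options.split(","):
            flagged.append((device, mount_point))
    return flagged
```

To use it on a live host, pass in the text read from /proc/mounts; an empty result means no XFS mount was found with barriers explicitly turned off.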

    Regards,

    Ric

    On 02/27/2016 07:20 AM, fangchen sun wrote:

        Thank you for your response!

        All my hosts have RAID cards. Some RAID cards are in pass-through
        mode, and the others are in write-back mode. I will set all RAID cards
        to pass-through mode and observe for a period of time.

        Best Regards
        sunspot

        2016-02-25 20:07 GMT+08:00 Ferhat Ozkasgarli <ozkasgarli@xxxxxxxxx>:

            This has happened to me before, but in a virtual machine environment.

            The VM was KVM and the storage was RBD. My problem was a bad network
            cable.

            You should check the following details:

            1-) Do you use any kind of hardware RAID configuration? (RAID 0, 5,
            or 10)

            Ceph does not work well on hardware RAID systems. You should use RAID
            cards in HBA (non-RAID) mode and let the RAID card pass the disks
            through.

            2-) Check your network connections.

            It may seem an obvious solution, but believe me, the network is one of
            the top-rated culprits in Ceph environments.

            3-) If you are using SSDs, make sure you use a non-RAID
            configuration.

            On Tue, Feb 23, 2016 at 10:55 PM, fangchen sun
            <sunspot0105@xxxxxxxxx> wrote:

                Dear all:

                I have a Ceph object storage cluster with 143 OSDs and 7 radosgw
                instances, and chose XFS as the underlying file system.
                I recently ran into a problem where an OSD is sometimes marked down
                when the function "chain_setxattr()" returns -117. So far I can
                only unmount the disk and repair it with "xfs_repair".
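
A side note on the -117 above: on Linux, errno 117 is EUCLEAN ("Structure needs cleaning"), which is the code XFS reports when it detects on-disk corruption, so the return value and the dmesg message describe the same event. The Python sketch below shows one way to decode such a negative return value. The helper name is made up for illustration, and the symbol it prints for 117 depends on the platform's errno table, so it assumes a Linux host.

```python
import errno
import os
from typing import Tuple


def describe_errno(return_value: int) -> Tuple[int, str, str]:
    """Map a negative kernel-style return value to (number, symbol, message)."""
    number = abs(return_value)
    # errno.errorcode only knows the codes defined on the running platform.
    symbol = errno.errorcode.get(number, "UNKNOWN")
    return number, symbol, os.strerror(number)


if __name__ == "__main__":
    # -117 is the value returned by chain_setxattr() in the report above.
    print(describe_errno(-117))
```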

                OS: CentOS 6.5
                Kernel version: 2.6.32

                Output from dmesg:
                [41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted 2.6.32-925.431.23.3.letv.el6.x86_64 #1
                [41796028.532227] Call Trace:
                [41796028.532255] [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
                [41796028.532276] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
                [41796028.532296] [<ffffffffa01e1ece>] ? xfs_corruption_error+0x5e/0x90 [xfs]
                [41796028.532316] [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
                [41796028.532335] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
                [41796028.532359] [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
                [41796028.532380] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
                [41796028.532399] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
                [41796028.532426] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
                [41796028.532455] [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70 [xfs]
                [41796028.532476] [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
                [41796028.532495] [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510 [xfs]
                [41796028.532517] [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
                [41796028.532536] [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
                [41796028.532560] [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
                [41796028.532584] [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20 [xfs]
                [41796028.532592] [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
                [41796028.532596] [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
                [41796028.532600] [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
                [41796028.532604] [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
                [41796028.532607] [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
                [41796028.532612] [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
                [41796028.532617] [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
                [41796028.532621] [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
                [41796028.532626] [<ffffffff8152912e>] ? thread_return+0x4e/0x760
                [41796028.532630] [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
                [41796028.532633] [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
                [41796028.532636] XFS (sdi1): Corruption detected. Unmount and run xfs_repair

                Any comments will be much appreciated!

                Best Regards!
                sunspot

    _______________________________________________
    ceph-users mailing list
    ceph-users@xxxxxxxxxxxxxx
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
