I'm sorry for my late reply, and thank you for yours.
Yes, this error only occurs when the backend is XFS. ext4 + BlueStore will not trigger the error.

> On Mar 12, 2018, at 6:31 PM, Peter Woodman <peter@xxxxxxxxxxxx> wrote:
>
> from what i've heard, xfs has problems on arm. use btrfs, or (i
> believe?) ext4+bluestore will work.
>
> On Sun, Mar 11, 2018 at 9:49 PM, Christian Wuerdig
> <christian.wuerdig@xxxxxxxxx> wrote:
>> Hm, so you're running OSD nodes with 2GB of RAM and 2x10TB = 20TB of
>> storage? Literally everything posted on this list in relation to HW
>> requirements and related problems will tell you that this simply isn't
>> going to work. The slightest hint of a problem will simply kill the OSD
>> nodes with OOM. Have you tried with smaller disks - like 1TB models (or
>> even smaller, like 256GB SSDs) - to see if the same problem persists?
>>
>>
>> On Tue, 6 Mar 2018 at 10:51, 赵贺东 <zhaohedong@xxxxxxxxx> wrote:
>>>
>>> Hello ceph-users,
>>>
>>> This is a really, really, REALLY tough problem for our team.
>>> We have investigated the problem for a long time and tried many things,
>>> but we can't solve it; even the exact cause of the problem is still
>>> unclear to us!
>>> So any solution/suggestion/opinion whatsoever will be highly, highly
>>> appreciated!!!
>>>
>>> Problem summary:
>>> When we activate an OSD, there will be metadata corruption on the disk
>>> being activated; the probability is 100%!
>>>
>>> Admin node & MON node:
>>> Platform: X86
>>> OS: Ubuntu 16.04
>>> Kernel: 4.12.0
>>> Ceph: Luminous 12.2.2
>>>
>>> OSD nodes:
>>> Platform: armv7
>>> OS: Ubuntu 14.04
>>> Kernel: 4.4.39
>>> Ceph: Luminous 12.2.2
>>> Disk: 10TB + 10TB
>>> Memory: 2GB
>>>
>>> Deploy log:
>>>
>>>
>>> dmesg log: (Sorry, the arms001-01 dmesg log has been lost, but the
>>> metadata corruption error messages on arms003-10 are the same as on
>>> arms001-01)
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.534232] XFS (sda1): Unmount and run xfs_repair
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.539100] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.545504] eb82f000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.553569] eb82f010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.561624] eb82f020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.569706] eb82f030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.577778] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.602944] XFS (sda1): Metadata corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.614170] XFS (sda1): Unmount and run xfs_repair
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.619030] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.625403] eb901000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.633441] eb901010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.641474] eb901020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.649519] eb901030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.657554] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.675056] XFS (sda1): Metadata corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.686228] XFS (sda1): Unmount and run xfs_repair
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.691054] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.697425] eb901000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.705459] eb901010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.713489] eb901020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.721520] eb901030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.729558] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.741953] XFS (sda1): Metadata corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.753139] XFS (sda1): Unmount and run xfs_repair
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.757955] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.764336] eb901000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.772365] eb901010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.780395] eb901020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.788417] eb901030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>>> Mar 5 11:08:49 arms003-10 kernel: [ 252.796514] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>>>
>>> Our attempts to solve the problem:
>>> 1. Deploying the OSD manually: the same error still occurs (confirmed).
>>> 2. Browsing the kernel bug-fix logs: no related fix has been found since kernel 4.4.39.
>>> 3. Upgrading xfsprogs from 3.1.9 to 4.15.0: the error number changed, but the disk still gets corrupted while activating the OSD!
>>>
>>> [2912641.987937] XFS (sda1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0xfffffff0
>>> [2912641.999203] XFS (sda1): Unmount and run xfs_repair
>>> [2912642.004202] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>>> [2912642.010759] e689a000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>>> [2912642.018958] e689a010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [2912642.027177] e689a020: 61 7b 64 0d fa fe 41 14 bf ea 90 32 6c 73 e5 ad a{d...A....2ls..
>>> [2912642.035388] e689a030: 00 00 00 00 50 00 00 08 ff ff ff ff ff ff ff ff ....P...........
>>> [2912642.043630] XFS (sda1): metadata I/O error: block 0xfffffff0 ("xfs_trans_read_buf_map") error 74 numblks 8
>>> [2912642.060390] XFS (sda1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0xfffffff0
>>> [2912642.071673] XFS (sda1): Unmount and run xfs_repair
>>>
>>> 4. Using the disk as an OSD on an X86 node does not trigger the problem (confirmed).
>>> 5. Using sgdisk & mkfs.xfs to format the disk, mounting it, running some read & write dd tests, and then unmounting does not trigger the problem (confirmed).
>>> 6. Changing the Ceph version from 12.2.2 to 10.2.10: the problem still exists (confirmed).
>>> 10.2.0 deploy log:
>>>
>>> The corruption error is the same as with 12.2.2.
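
For reference, the format-and-dd check mentioned in step 5 of my original mail (quoted above) was roughly along the lines of the sketch below. The device name, partition layout, mount point, file size and block size here are illustrative only, not our exact values:

  sgdisk --zap-all /dev/sda
  sgdisk --new=1:0:0 /dev/sda                 # one partition spanning the whole disk
  mkfs.xfs -f /dev/sda1
  mkdir -p /mnt/test
  mount /dev/sda1 /mnt/test
  dd if=/dev/zero of=/mnt/test/testfile bs=1M count=4096 conv=fsync   # write test
  dd if=/mnt/test/testfile of=/dev/null bs=1M                         # read test
  umount /mnt/test
  dmesg | tail -n 50                          # no XFS corruption messages after this

Running this kind of test by hand on the arm nodes never reproduces the corruption; it only appears once the disk is activated as an OSD.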