From what I've heard, XFS has problems on ARM. Use btrfs, or (I believe?) ext4+bluestore will work.

On Sun, Mar 11, 2018 at 9:49 PM, Christian Wuerdig <christian.wuerdig@xxxxxxxxx> wrote:
> Hm, so you're running OSD nodes with 2GB of RAM and 2x10TB = 20TB of
> storage? Literally everything posted on this list in relation to HW
> requirements and related problems will tell you that this simply isn't going
> to work. The slightest hint of a problem will simply kill the OSD nodes with
> OOM. Have you tried with smaller disks - like 1TB models (or even smaller
> like 256GB SSDs) and see if the same problem persists?
>
>
> On Tue, 6 Mar 2018 at 10:51, 赵赵贺东 <zhaohedong@xxxxxxxxx> wrote:
>>
>> Hello ceph-users,
>>
>> It is a really, really tough problem for our team.
>> We have investigated the problem for a long time and tried many things, but
>> we can't solve it; even the concrete cause of the problem is still
>> unclear to us.
>> So any solution, suggestion, or opinion will be highly appreciated!
>>
>> Problem Summary:
>> When we activate an OSD, metadata corruption occurs on the disk being
>> activated; the probability is 100%!
>>
>> Admin Nodes & MON node:
>> Platform: X86
>> OS: Ubuntu 16.04
>> Kernel: 4.12.0
>> Ceph: Luminous 12.2.2
>>
>> OSD nodes:
>> Platform: armv7
>> OS: Ubuntu 14.04
>> Kernel: 4.4.39
>> Ceph: Luminous 12.2.2
>> Disk: 10T+10T
>> Memory: 2GB
>>
>> Deploy log:
>>
>>
>> dmesg log: (Sorry, the arms001-01 dmesg log has been lost, but the error
>> messages about metadata corruption on arms003-10 are the same as on
>> arms001-01.)
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.534232] XFS (sda1): Unmount and run xfs_repair
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.539100] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.545504] eb82f000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.553569] eb82f010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.561624] eb82f020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.569706] eb82f030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.577778] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.602944] XFS (sda1): Metadata corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.614170] XFS (sda1): Unmount and run xfs_repair
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.619030] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.625403] eb901000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.633441] eb901010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.641474] eb901020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.649519] eb901030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.657554] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.675056] XFS (sda1): Metadata corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.686228] XFS (sda1): Unmount and run xfs_repair
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.691054] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.697425] eb901000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.705459] eb901010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.713489] eb901020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.721520] eb901030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.729558] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.741953] XFS (sda1): Metadata corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.753139] XFS (sda1): Unmount and run xfs_repair
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.757955] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.764336] eb901000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.772365] eb901010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.780395] eb901020: fc 4e e3 89 50 8f 42 aa be bc 07 0c 6e fa 83 2f .N..P.B.....n../
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.788417] eb901030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff ................
>> Mar 5 11:08:49 arms003-10 kernel: [ 252.796514] XFS (sda1): metadata I/O error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>>
>> What we have tried so far:
>> 1. Deployed the OSD manually; confirmed that the same error still occurs.
>> 2. Browsed the kernel bug-fix log, but found no related fix since
>> kernel 4.4.39.
>> 3. Upgraded xfsprogs from 3.1.9 to 4.15.0; the error number changed (from
>> 117 to 74 in the log below), but the disk is still corrupted while
>> activating the OSD!
>>
>> [2912641.987937] XFS (sda1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0xfffffff0
>> [2912641.999203] XFS (sda1): Unmount and run xfs_repair
>> [2912642.004202] XFS (sda1): First 64 bytes of corrupted metadata buffer:
>> [2912642.010759] e689a000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb XFSB.........s..
>> [2912642.018958] e689a010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>> [2912642.027177] e689a020: 61 7b 64 0d fa fe 41 14 bf ea 90 32 6c 73 e5 ad a{d...A....2ls..
>> [2912642.035388] e689a030: 00 00 00 00 50 00 00 08 ff ff ff ff ff ff ff ff ....P...........
>> [2912642.043630] XFS (sda1): metadata I/O error: block 0xfffffff0 ("xfs_trans_read_buf_map") error 74 numblks 8
>> [2912642.060390] XFS (sda1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0xfffffff0
>> [2912642.071673] XFS (sda1): Unmount and run xfs_repair
>>
>> 4. Confirmed that using the disk in an OSD node on X86 does not trigger
>> the problem.
>> 5. Confirmed that formatting the disk with sgdisk and mkfs.xfs, mounting
>> it, running some dd read/write tests, and then unmounting does not
>> trigger the problem.
>> 6. Changed the Ceph version from 12.2.2 to 10.2.10; confirmed that the
>> problem still exists.
>> 10.2.0 Deploy log
>>
>> The corruption error is the same as with 12.2.2.
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
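
P.S. For anyone reading the hex dumps in the dmesg output quoted above: here is a minimal Python sketch that decodes the first 16 bytes of the "corrupted metadata buffer". It assumes the standard XFS on-disk superblock layout (4-byte magic, 4-byte block size, 8-byte data block count, all big-endian). The helper name decode_sb_prefix and the XFS_ERRNO table are made up for illustration; the errno names are the Linux values that XFS aliases as EFSCORRUPTED and EFSBADCRC. Treat it as a reading aid, not a diagnosis.

```python
import struct

# First 16 bytes of the "corrupted metadata buffer" as printed in the dmesg
# output above (the same bytes appear in both the error 117 and error 74 dumps).
DUMP_PREFIX = "58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb"

# Linux errno numbers seen in the log, with the names XFS uses for them.
XFS_ERRNO = {
    117: "EUCLEAN (used by XFS as EFSCORRUPTED)",
    74: "EBADMSG (used by XFS as EFSBADCRC)",
}


def decode_sb_prefix(hex_dump):
    """Hypothetical helper: decode magic, block size and data block count.

    Assumes the bytes follow the XFS superblock layout, which is big-endian:
    4-byte magic, 4-byte block size, 8-byte number of data blocks.
    """
    raw = bytes.fromhex(hex_dump)  # spaces between byte pairs are skipped
    magic, blocksize, dblocks = struct.unpack(">4sIQ", raw[:16])
    return {
        "magic": magic,
        "blocksize": blocksize,
        "dblocks": dblocks,
        "size_bytes": blocksize * dblocks,
    }


if __name__ == "__main__":
    info = decode_sb_prefix(DUMP_PREFIX)
    print(info)
    for code in (117, 74):
        print(code, "->", XFS_ERRNO[code])
```

Decoded this way, the buffer starts with the magic "XFSB", a block size of 4096 and 2440298235 data blocks, i.e. roughly 10 TB, which matches the 10T disks. So the buffer that failed the xfs_dir3_data verifier looks like superblock data rather than a directory block. That is only an observation from the dump; it does not by itself say where the bad data came from.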