On Thu, Jul 28, 2016 at 11:34 PM, Anatoly Pugachev <matorola@xxxxxxxxx> wrote:
> On Thu, Jul 28, 2016 at 9:04 PM, David Sterba <dsterba@xxxxxxx> wrote:
>> On Thu, Jul 28, 2016 at 04:28:41PM +0200, John Paul Adrian Glaubitz wrote:
>>> On 07/28/2016 04:25 PM, John Paul Adrian Glaubitz wrote:
>>> > On 07/28/2016 04:01 PM, Anatoly Pugachev wrote:
>>> >> Program received signal SIGBUS, Bus error.
>>> >> 0x0000000000177dfc in raid6_gen_syndrome (disks=4, bytes=65536,
>>> >> ptrs=0x2c4510) at raid6.c:87
>>> >> 87 wq0 = wp0 = *(unative_t *)&dptr[z0][d+0*NSIZE];
>>> >
>>> > That should be easy to fix. Just make the R values aligned with the
>>> > appropriate get_aligned functions, see David's previous commit [1]:
>>>
>>> Argh, those are called get_UNaligned_*, not get_aligned_*.
>>>
>>> > There are more lines in raid6.c which need the same fix, basically everything
>>> > with * (unative_t *).
>>>
>>> Oh, and you will somehow need to guard this with #if BITS_PER_LONG == 64 ...
>>> #else ... #endif respectively since you need to use different versions
>>> (64 vs. 32) of get_unaligned_* depending on the size of unative_t.
>>
>> And I've fixed it that way, now pushed to devel ("btrfs-progs: fix
>> unaligned access in raid6 calculations" [1]). Would be great if you or
>> Anatoly can test it so I can add it to the 4.7 release (ETA tomorrow).
>
> David,
> well, I think mkfs.btrfs is fixed, since I just tested it with :
> root@nvg5120:/home/mator/xfstests# ./check 'btrfs/06?'
> FSTYP -- btrfs
> PLATFORM -- Linux/sparc64 nvg5120 4.7.0+
> MKFS_OPTIONS -- /dev/loop0
> MOUNT_OPTIONS -- /dev/loop0 /mnt/scratch
>
> btrfs/060 145s
> btrfs/061 158s
> btrfs/062 288s
> btrfs/063 141s
> btrfs/064 129s
> btrfs/065 44s
> btrfs/066 46s
> btrfs/067 - output mismatch (see
> /home/mator/xfstests/results//btrfs/067.out.bad)
> --- tests/btrfs/067.out 2016-07-20 12:12:21.772228422 +0300
> +++ /home/mator/xfstests/results//btrfs/067.out.bad 2016-07-28
> 22:54:00.059192629 +0300
> @@ -1,2 +1,3 @@
> QA output created by 067
> Silence is golden
> +Scrub find errors in "-m single -d single" test
> ...
> (Run 'diff -u tests/btrfs/067.out
> /home/mator/xfstests/results//btrfs/067.out.bad' to see the entire
> diff)
> btrfs/068 57s
> btrfs/069 45s
> Ran: btrfs/060 btrfs/061 btrfs/062 btrfs/063 btrfs/064 btrfs/065
> btrfs/066 btrfs/067 btrfs/068 btrfs/069
> Failures: btrfs/067
> Failed 1 of 10 tests
>
>
> previously (before mkfs.btrfs fix) , all tests from 06? were bad/failed.
>
> Starting from "tests/btrfs/064" kernel started to log TPC (Trap
> Program Counter register) messages, a lot of them.
>
> Results of the this test i put on a webserver [1].
> Output of journalctl -b (from boot) with TPC messages are at [2].
>
> Not sure what we need to do with sparc64 btrfs module TPC messages.
> Probably fill kernel bugzilla report?
>
> Thanks.
>
> [1] http://u163.east.ru/btrfs/xfstests-btrfs-06x-results.tar.gz
> [2] http://u163.east.ru/btrfs/kernel-4.7.0+-logs-xfstests-06x.txt.gz
>
> PS: my xfstests setup is the following:
>
> # mount tmpfs -t tmpfs -o size=13g /ramdisk/
> /ramdisk# for i in 1 2 3 4 5 6; do fallocate -l 1g scratch${i}; done
> /ramdisk# fallocate -l 4g testvol1
>
> /ramdisk# for i in *; do losetup -f $i; done
> /home/mator/xfstests# losetup
> NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE         DIO
> /dev/loop0         0      0         0  0 /ramdisk/scratch1   0
> /dev/loop1         0      0         0  0 /ramdisk/scratch2   0
> /dev/loop2         0      0         0  0 /ramdisk/scratch3   0
> /dev/loop3         0      0         0  0 /ramdisk/scratch4   0
> /dev/loop4         0      0         0  0 /ramdisk/scratch5   0
> /dev/loop5         0      0         0  0 /ramdisk/scratch6   0
> /dev/loop6         0      0         0  0 /ramdisk/testvol1   0
>
> # mkfs.btrfs /dev/loop6
> btrfs-progs v4.6.1-66-g4367e35
> See http://btrfs.wiki.kernel.org for more information.
>
> Performing full device TRIM (4.00GiB) ...
> Label:              (null)
> UUID:               6a4d5918-adfe-469c-8454-9b28545b88bc
> Node size:          16384
> Sector size:        8192
> Filesystem size:    4.00GiB
> Block group profiles:
>   Data:             single            8.00MiB
>   Metadata:         DUP             204.75MiB
>   System:           DUP               8.00MiB
> SSD detected:       no
> Incompat features:  extref, skinny-metadata
> Number of devices:  1
> Devices:
>    ID        SIZE  PATH
>     1     4.00GiB  /dev/loop6
>
> root@nvg5120:/home/mator/xfstests# cat local.config
> export TEST_DEV=/dev/loop6
> export TEST_DIR=/fst
> export SCRATCH_DEV_POOL="/dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
> /dev/loop4 /dev/loop5"
> export SCRATCH_MNT=/mnt/scratch

Just to add, I've also run the tests from btrfs/000 to btrfs/059, and the
results are not too bad:

Ran: btrfs/001 btrfs/002 btrfs/005 btrfs/006 btrfs/008 btrfs/009 btrfs/010
btrfs/012 btrfs/013 btrfs/014 btrfs/015 btrfs/016 btrfs/017 btrfs/018
btrfs/019 btrfs/020 btrfs/021 btrfs/022 btrfs/023 btrfs/024 btrfs/025
btrfs/026 btrfs/027 btrfs/028 btrfs/029 btrfs/030 btrfs/031 btrfs/032
btrfs/033 btrfs/034 btrfs/035 btrfs/036 btrfs/037 btrfs/038 btrfs/039
btrfs/040 btrfs/041 btrfs/042
btrfs/043 btrfs/044 btrfs/045 btrfs/046 btrfs/048 btrfs/049 btrfs/050
btrfs/051 btrfs/052 btrfs/053 btrfs/054 btrfs/055 btrfs/056 btrfs/057
btrfs/058 btrfs/059
Not run: btrfs/003 btrfs/004 btrfs/007 btrfs/011 btrfs/047
Failures: btrfs/010 btrfs/012 btrfs/057
Failed 3 of 54 tests

Failures:

btrfs/010 - failed with "number of extents mis-match!"

$ cat /home/mator/xfstests/results//btrfs/010.full
Create subvolume '/mnt/scratch/subvol'
1+0 records in
1+0 records out
1+0 records in
1+0 records out
1+0 records in
1+0 records out
1+0 records in
1+0 records out
1+0 records in
1+0 records out
Create a snapshot of '/mnt/scratch/subvol' in '/mnt/scratch/snap-2'
Create a snapshot of '/mnt/scratch/subvol' in '/mnt/scratch/snap-1'
/mnt/scratch/subvol/foobar:
  0: [0..79]: 24704..24783
/mnt/scratch/snap-1/foobar:
  0: [0..31]: 24672..24703
  1: [32..47]: 24656..24671
  2: [48..63]: 24608..24623
  3: [64..79]: 24592..24607
/mnt/scratch/snap-2/foobar:
  0: [0..31]: 24672..24703
  1: [32..47]: 24656..24671
  2: [48..63]: 24608..24623
  3: [64..79]: 24592..24607
1
4
4

btrfs/012 - failed with "btrfs-convert failed"

$ cat /home/mator/xfstests/results//btrfs/012.full
mke2fs 1.43.1 (08-Jun-2016)
Discarding device blocks: done
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 98d0756e-76b6-4ab1-ac7d-a1fceb4b21b4
Superblock backups stored on blocks:
  32768, 98304, 163840, 229376

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

ERROR: system chunk array too big 1627389952 > 2048
ERROR: superblock checksum matches but it has invalid members
No valid Btrfs found on /dev/loop0
unable to open ctree
conversion aborted
create btrfs filesystem:
  blocksize: 4096
  nodesize: 16384
  features: extref, skinny-metadata (default)
btrfs-convert failed

btrfs/057 - failed: '_scratch_mkfs -b 1g --nodesize 4096'

$ cat /home/mator/xfstests/results//btrfs/057.full
# _scratch_mkfs -b 1g --nodesize 4096
ERROR: illegal nodesize 4096 (smaller than 8192)
failed: '_scratch_mkfs -b 1g --nodesize 4096'

JFYI,
mator@nvg5120:~$ getconf PAGE_SIZE
8192

These 000-059 tests were run after a fresh reboot. During test 027 the
kernel started to log TPC messages like these:

Jul 29 12:10:32 nvg5120 unknown: run fstests btrfs/027 at 2016-07-29 12:10:32
...
Jul 29 12:10:58 nvg5120 kernel: BTRFS info (device loop4): allowing degraded mounts
Jul 29 12:10:58 nvg5120 kernel: BTRFS info (device loop4): disk space caching is enabled
Jul 29 12:10:58 nvg5120 kernel: BTRFS info (device loop4): has skinny extents
Jul 29 12:10:59 nvg5120 kernel: BTRFS info (device loop4): dev_replace from <missing disk> (devid 2) to /dev/loop5 started
Jul 29 12:10:59 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:10:59 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:10:59 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:10:59 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:10:59 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:11:00 nvg5120 kernel: BTRFS info (device loop4): dev_replace from <missing disk> (devid 2) to /dev/loop5 finished
...
Jul 29 12:11:07 nvg5120 kernel: BTRFS info (device loop4): allowing degraded mounts
Jul 29 12:11:07 nvg5120 kernel: BTRFS info (device loop4): disk space caching is enabled
Jul 29 12:11:07 nvg5120 kernel: BTRFS info (device loop4): has skinny extents
Jul 29 12:11:08 nvg5120 kernel: BTRFS info (device loop4): dev_replace from <missing disk> (devid 2) to /dev/loop5 started
Jul 29 12:11:08 nvg5120 kernel: log_unaligned: 10616 callbacks suppressed
Jul 29 12:11:08 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:11:09 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:11:09 nvg5120 kernel: Kernel unaligned access at TPC[118e002c] __btrfs_map_block+0x36c/0x1180 [btrfs]
Jul 29 12:11:09 nvg5120 kernel: Kernel unaligned access at TPC[118e0094] __btrfs_map_block+0x3d4/0x1180 [btrfs]
Jul 29 12:11:09 nvg5120 kernel: Kernel unaligned access at TPC[118e0960] __btrfs_map_block+0xca0/0x1180 [btrfs]
Jul 29 12:11:09 nvg5120 kernel: BTRFS info (device loop4): dev_replace from <missing disk> (devid 2) to /dev/loop5 finished
Jul 29 12:11:11 nvg5120 mator[34598]: run xfstest btrfs/028

The TPC messages appeared only during test 027; the tests that followed
finished without them.
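
For reference, a minimal C sketch of the general pattern behind the
raid6.c SIGBUS quoted above and behind "Kernel unaligned access"
messages in general: on a strict-alignment machine such as sparc64,
dereferencing a pointer cast to a wider type traps when the address is
not suitably aligned, while copying the bytes with memcpy() does not.
The helper names below are made up for illustration. They are not the
get_unaligned_* helpers from btrfs-progs or the kernel, and this is not
a claim about what __btrfs_map_block actually needs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): read a 64-bit value from an
 * address that carries no alignment guarantee.
 */
static uint64_t load_u64_unaligned(const void *p)
{
	uint64_t v;

	assert(p != NULL);
	/* memcpy() has no alignment requirement on its arguments, so this
	 * is safe where *(const uint64_t *)p could trap. */
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical matching store for the same situation. */
static void store_u64_unaligned(void *p, uint64_t v)
{
	assert(p != NULL);
	memcpy(p, &v, sizeof(v));
}
```

With helpers of this kind, a load such as *(unative_t *)&dptr[z0][d]
would become a call on &dptr[z0][d], assuming a 64-bit unative_t. A
32-bit build would need a 32-bit variant, which is what the
BITS_PER_LONG guard mentioned in the quoted discussion is about.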