On Sun, Jul 22, 2012 at 10:05 AM, Victor Meyerson <calculuspenguin@xxxxxxxxx> wrote: > Hi, > > I recently found a bug related to direct io in post 3.3 linux kernels. Fortunately, my hardware (a Cobalt Qube2) is supported by the vanilla kernel so I did not need additional patch sets to get the machine to boot. I ran git bisect on the main tree[1] and tested the various bisect results until git reported the first bad commit. After several bisects and many reboots, git reported that [2] was the first bad commit. > > In testing this I came up with a repeatable process. Unfortunately, I do not have any other MIPS hardware to test this on and I believe that based on the commit in question that it is MIPS related. My procedure is as follows: > > 1) Create a random file to be used on the two kernels (one before the commit, and one that includes the commit) > $ dd if=/dev/urandom of=random-file bs=512 count=30720 > 30720+0 records in > 30720+0 records out > 15728640 bytes (16 MB) copied, 60.7035 s, 259 kB/s > $ chmod -w random-file > > 2) Reboot to the kernel before the commit and run dd with direct io. Repeat. > $ uname -a > Linux horadric 3.2.0-dirty #2 Fri Jul 13 06:20:22 PDT 2012 mips64 Nevada V10.0 FPU V10.0 Cobalt Qube2 GNU/Linux > $ dd if=random-file of=portion-of-random-3.2.0 bs=512 count=20480 iflag=direct > 20480+0 records in > 20480+0 records out > 10485760 bytes (10 MB) copied, 42.3636 s, 248 kB/s > $ reboot > $ dd if=random-file of=portion-of-random-3.2.0-2 bs=512 count=20480 iflag=direct > 20480+0 records in > 20480+0 records out > 10485760 bytes (10 MB) copied, 42.5252 s, 247 kB/s > > 3) Reboot to the kernel with the commit and run dd with direct io. Repeat. > $ uname -a > Linux horadric 3.2.0-rc4-00003-gb1c10be-dirty #15 Fri Jul 20 15:05:13 PDT 2012 mips64 Nevada V10.0 FPU V10.0 Cobalt Qube2 GNU/Linux > $ dd if=random-file of=portion-of-random-3.2.0-rc4 bs=512 count=20480 iflag=direct > 20480+0 records in > 20480+0 records out > 10485760 bytes (10 MB) copied, 40.6226 s, 258 kB/s > $ reboot > $ dd if=random-file of=portion-of-random-3.2.0-rc4-2 bs=512 count=20480 iflag=direct > 20480+0 records in > 20480+0 records out > 10485760 bytes (10 MB) copied, 40.8856 s, 256 kB/s > Hi Victor, Create files with dd if=random-file of=portion-of-random-3.2.0-rc4 bs=8k count=1280 iflag=direct dd if=random-file of=portion-of-random-3.2.0-rc4-2 bs=8k count=1280 iflag=direct without reboot(why reboot needed?), then see the changes in checksums. Thanks Hillf > 4) Compare checksums of the resulting files. > $ sha256sum portion-of-random-3.2.0* > c98a6e949b36448842a21f68e7c6a5daff1f161e1eb3e3529176cf56bf5af89e portion-of-random-3.2.0 > c98a6e949b36448842a21f68e7c6a5daff1f161e1eb3e3529176cf56bf5af89e portion-of-random-3.2.0-2 > dca27da87a78580b8a34bbff2790ae80d3aa880d5d00fc2126f109d6fff9e056 portion-of-random-3.2.0-rc4 > 703cf02d4fa90679d4a75900e7e5a3b8c3000a65bfc475610b10f17bb88bedbc portion-of-random-3.2.0-rc4-2 > > Notice how the last two files have different checksums between themselves and even different from the first two files. This lead me to believe that there is a problem with direct io. All the files are the same size and should include the same portion of the random file created in step 1). > > My configuration is the Cobalt Qube2 with a 64-bit kernel and an n32 userspace. Hopefully someone with a much more deeper understanding of the kernel can confirm and provide a fix for this (assuming one has not been created yet). > > Thanks. Let me know if there is any additional information that may help with the investigation. > > Victor > > > [1] http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > [2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=b1c10bea620f79109b5cc9935267bea4f6f29ac6