----- Original Message -----
> From: Hillf Danton <dhillf@xxxxxxxxx>
> To: Victor Meyerson <calculuspenguin@xxxxxxxxx>
> Cc: "linux-mips@xxxxxxxxxxxxxx" <linux-mips@xxxxxxxxxxxxxx>; Ralf Baechle <ralf@xxxxxxxxxxxxxx>; LKML <linux-kernel@xxxxxxxxxxxxxxx>
> Sent: Tuesday, July 24, 2012 6:04 AM
> Subject: Re: Direct I/O bug in kernel
>
> On Sun, Jul 22, 2012 at 10:05 AM, Victor Meyerson <calculuspenguin@xxxxxxxxx> wrote:
>> Hi,
>>
>> I recently found a bug related to direct io in post 3.3 linux kernels. Fortunately, my hardware (a Cobalt Qube2) is supported by the vanilla kernel so I did not need additional patch sets to get the machine to boot. I ran git bisect on the main tree[1] and tested the various bisect results until git reported the first bad commit. After several bisects and many reboots, git reported that [2] was the first bad commit.
>>
>> In testing this I came up with a repeatable process. Unfortunately, I do not have any other MIPS hardware to test this on and I believe that based on the commit in question that it is MIPS related. My procedure is as follows:
>>
>> 1) Create a random file to be used on the two kernels (one before the commit, and one that includes the commit)
>> $ dd if=/dev/urandom of=random-file bs=512 count=30720
>> 30720+0 records in
>> 30720+0 records out
>> 15728640 bytes (16 MB) copied, 60.7035 s, 259 kB/s
>> $ chmod -w random-file
>>
>> 2) Reboot to the kernel before the commit and run dd with direct io. Repeat.
>> $ uname -a
>> Linux horadric 3.2.0-dirty #2 Fri Jul 13 06:20:22 PDT 2012 mips64 Nevada V10.0 FPU V10.0 Cobalt Qube2 GNU/Linux
>> $ dd if=random-file of=portion-of-random-3.2.0 bs=512 count=20480 iflag=direct
>> 20480+0 records in
>> 20480+0 records out
>> 10485760 bytes (10 MB) copied, 42.3636 s, 248 kB/s
>> $ reboot
>> $ dd if=random-file of=portion-of-random-3.2.0-2 bs=512 count=20480 iflag=direct
>> 20480+0 records in
>> 20480+0 records out
>> 10485760 bytes (10 MB) copied, 42.5252 s, 247 kB/s
>>
>> 3) Reboot to the kernel with the commit and run dd with direct io. Repeat.
>> $ uname -a
>> Linux horadric 3.2.0-rc4-00003-gb1c10be-dirty #15 Fri Jul 20 15:05:13 PDT 2012 mips64 Nevada V10.0 FPU V10.0 Cobalt Qube2 GNU/Linux
>> $ dd if=random-file of=portion-of-random-3.2.0-rc4 bs=512 count=20480 iflag=direct
>> 20480+0 records in
>> 20480+0 records out
>> 10485760 bytes (10 MB) copied, 40.6226 s, 258 kB/s
>> $ reboot
>> $ dd if=random-file of=portion-of-random-3.2.0-rc4-2 bs=512 count=20480 iflag=direct
>> 20480+0 records in
>> 20480+0 records out
>> 10485760 bytes (10 MB) copied, 40.8856 s, 256 kB/s
>>
> Hi Victor,
>
> Create files with
>
> dd if=random-file of=portion-of-random-3.2.0-rc4 bs=8k count=1280 iflag=direct
> dd if=random-file of=portion-of-random-3.2.0-rc4-2 bs=8k count=1280 iflag=direct
>
> without reboot (why reboot needed?), then see the changes in checksums.
>
> Thanks
> Hillf
>

Hi Hillf,

I rebooted in an attempt to make sure nothing was cached between runs.
In any case, here are the results without a reboot:

$ dd if=random-file of=portion-of-random-3.2.0-rc4 bs=8k count=1280 iflag=direct
1280+0 records in
1280+0 records out
10485760 bytes (10 MB) copied, 6.00599 s, 1.7 MB/s
$ dd if=random-file of=portion-of-random-3.2.0-rc4-2 bs=8k count=1280 iflag=direct
1280+0 records in
1280+0 records out
10485760 bytes (10 MB) copied, 5.25964 s, 2.0 MB/s
$ sha256sum portion-of-random-3.2.0-rc4*
4c56820030ce22e6cc96127a53f6025d11a78f2fd3b0c1dec44f6d6746f70bbd  portion-of-random-3.2.0-rc4
05c41d626a67b9bcddb0e7b905533c63a0866092b819bf01cdb2a80f29c2b162  portion-of-random-3.2.0-rc4-2

The checksums still differ, and I used the same random-file as in my first test.

Victor

>> 4) Compare checksums of the resulting files.
>> $ sha256sum portion-of-random-3.2.0*
>> c98a6e949b36448842a21f68e7c6a5daff1f161e1eb3e3529176cf56bf5af89e  portion-of-random-3.2.0
>> c98a6e949b36448842a21f68e7c6a5daff1f161e1eb3e3529176cf56bf5af89e  portion-of-random-3.2.0-2
>> dca27da87a78580b8a34bbff2790ae80d3aa880d5d00fc2126f109d6fff9e056  portion-of-random-3.2.0-rc4
>> 703cf02d4fa90679d4a75900e7e5a3b8c3000a65bfc475610b10f17bb88bedbc  portion-of-random-3.2.0-rc4-2
>>
>> Notice how the last two files have different checksums between themselves and even different from the first two files. This led me to believe that there is a problem with direct io. All the files are the same size and should include the same portion of the random file created in step 1).
>>
>> My configuration is the Cobalt Qube2 with a 64-bit kernel and an n32 userspace. Hopefully someone with a much deeper understanding of the kernel can confirm and provide a fix for this (assuming one has not been created yet).
>>
>> Thanks. Let me know if there is any additional information that may help with the investigation.
>>
>> Victor
>>
>>
>> [1] http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>> [2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=b1c10bea620f79109b5cc9935267bea4f6f29ac6
>
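
The checksum comparison in step 4 can be scripted so that repeated direct I/O reads are checked the same way each time. Below is a minimal sketch, not taken from the thread: checksums_match is a hypothetical helper name, and read-1 and read-2 in the usage comment are placeholder output names. It assumes GNU coreutils sha256sum, as used in the transcripts above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): succeed only if every file
# passed as an argument has the same SHA-256 digest.
# Exit status: 0 = all digests equal, 1 = digests differ, 2 = usage/read error.
checksums_match() {
    # At least one file is required.
    [ "$#" -ge 1 ] || return 2

    # sha256sum prints "<digest>  <name>" per file and exits non-zero
    # if any file cannot be read.
    sums=$(sha256sum -- "$@") || return 2

    # Keep only the digest column and count the distinct values.
    distinct=$(printf '%s\n' "$sums" | awk '{ print $1 }' | sort -u | wc -l)
    [ "$distinct" -eq 1 ]
}

# Usage sketch (placeholder file names, same dd options as in the thread):
#   dd if=random-file of=read-1 bs=8k count=1280 iflag=direct
#   dd if=random-file of=read-2 bs=8k count=1280 iflag=direct
#   checksums_match read-1 read-2 || echo "direct I/O reads differ"
```

On a kernel without the problem, two direct reads of the same read-only file should make the helper succeed; a non-zero status of 1 corresponds to the mismatch reported above.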