On Sunday 22 August 2004 18:24, Jari Ruusu wrote:
> David Gümbel wrote:
> > > Aug 21 15:30:09 [kernel] EXT3-fs error (device loop0): ext3_readdir:
> > > directory # 802557 contains a hole at offset 348160
> > > Aug 21 15:30:09 [kernel] ext3_abort called.
> > > Aug 21 15:30:09 [kernel] EXT3-fs error (device loop0) in
> > > start_transaction: Journal has aborted
> > > - Last output repeated 3 times -
> >
> > So the problem doesn't seem to be related to register parameters.
> >
> > Any ideas?
>
> Even though I have been unable to reproduce these errors, I believe I
> found the cause of these errors.
>
> Kernel 2.6.7 to 2.6.8.1 change involved block driver API change which is
> rather nasty thing to do in "stable" kernel series.
>
> When file system code (or in this case loop code) sends I/O requests to
> block driver, it fills up a struct bio that tells driver what it should
> do. In this case two new entries were added to struct bio, and 2.6 loop
> code didn't initialize those value to sane default values that I/O
> elevator code was expecting. IOW, elevator code was accessing a structure
> that had two uninitialized variables.
>
> Does the patch below fix the problem? David, if you ack that this patch
> fixes it for you, then I will have to make new version of loop-AES ASAP.

I tried to apply the patch using patch, which didn't work (as far as I can
see because tabs and spaces got mixed up), so I typed in the changes
manually. A diff of the new vs.
the old file reads like this (just in case I made any mistakes ;):

marsupilami loop-AES-v2.1c # diff -u loop.c-2.6.patched.bak loop.c-2.6.patched-by-david
--- loop.c-2.6.patched.bak      2004-08-23 10:46:54.000000000 +0200
+++ loop.c-2.6.patched-by-david 2004-08-23 16:06:03.000000000 +0200
@@ -508,6 +508,10 @@
         bio->bi_phys_segments = 0;
         bio->bi_hw_segments = 0;
         bio->bi_size = len = orig_bio->bi_io_vec[merge->bi_idx].bv_len;
+#if defined(BIOVEC_VIRT_START_SIZE)
+        bio->bi_hw_front_size = 0;
+        bio->bi_hw_back_size = 0;
+#endif
         /* bio->bi_max_vecs not touched */
         bio->bi_io_vec[0].bv_len = len;
         bio->bi_io_vec[0].bv_offset = 0;

Anyway, I am getting the old error again, pretty soon after bootup and
starting a compile to get some load on the system (on /usr and /var, both
unencrypted ext3; loop0 is the encrypted, device-backed /home using ext3):

Aug 23 18:33:08 [kernel] EXT3-fs error (device loop0): ext3_readdir: directory #783444 contains a hole at offset 4096
Aug 23 18:33:09 [kernel] ext3_abort called.
Aug 23 18:35:18 [kernel] EXT3-fs error (device loop0) in start_transaction: Journal has aborted
- Last output repeated 22 times -
Aug 23 18:35:34 [login(pam_unix)] session closed for user root
Aug 23 18:35:35 [kernel] agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
Aug 23 18:35:40 [kernel] EXT3-fs error (device loop0) in start_transaction: Journal has aborted

In addition, I noticed that after a fresh bootup with
2.6.8.1-loop-AES-2.1c (fresh = bootup and root console login, [but KDM
started]), when trying to do "ls -la /tmp", I was getting an error from ls
saying it could not read /tmp (encrypted reiserfs, mounted fine).
echo "test" > /tmp/tryit && cat /tmp/tryit worked fine, though, and so did
"ls /tmp/tryit" and e.g. "file /tmp/tryit".

So, no - unfortunately the patch doesn't fix the problem for me with
2.6.8.1, loop-AES-2.1c (and mregparm enabled, although that shouldn't
matter).
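
For readers following the thread, the idea behind the patch (set every field a consumer may read, including ones added by a newer API) can be illustrated in userspace. This is only a sketch under stated assumptions: `struct mock_bio`, `init_known_fields`, `init_all_fields`, `consumer_sees_sane_defaults` and `simulate_reuse` are hypothetical names made up for illustration, not kernel or loop-AES code, and the 0xA5 fill merely stands in for stale contents of a reused structure.

```c
#include <string.h>

/* Hypothetical stand-in for an I/O descriptor; NOT the kernel's struct bio. */
struct mock_bio {
    unsigned int size;
    unsigned int phys_segments;
    unsigned int hw_segments;
    /* Fields assumed to be added by a newer API; old callers ignore them. */
    unsigned int hw_front_size;
    unsigned int hw_back_size;
};

/* Old-style setup: initializes only the fields the caller knows about. */
void init_known_fields(struct mock_bio *b, unsigned int len)
{
    b->size = len;
    b->phys_segments = 0;
    b->hw_segments = 0;
}

/* Patched setup: additionally gives the new fields a defined default. */
void init_all_fields(struct mock_bio *b, unsigned int len)
{
    init_known_fields(b, len);
    b->hw_front_size = 0;
    b->hw_back_size = 0;
}

/* Hypothetical consumer that expects the new fields to start at zero. */
int consumer_sees_sane_defaults(const struct mock_bio *b)
{
    return b->hw_front_size == 0 && b->hw_back_size == 0;
}

/* Reuse a descriptor that still holds stale bytes, then set it up with
 * either the old (patched == 0) or the patched (patched != 0) routine.
 * Returns 1 if the consumer sees zeroed new fields, 0 otherwise. */
int simulate_reuse(int patched)
{
    struct mock_bio b;

    memset(&b, 0xA5, sizeof b);   /* simulate leftovers from earlier use */
    if (patched)
        init_all_fields(&b, 4096);
    else
        init_known_fields(&b, 4096);
    return consumer_sees_sane_defaults(&b);
}
```

Calling `simulate_reuse(0)` leaves the two new fields holding the stale pattern, while `simulate_reuse(1)` resets them. Whether this is what actually goes wrong on 2.6.8.1 is a separate question, since the errors above persist with the patch applied.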
Kernel bootup parameters were "elevator=cfq" (like in the tests before), in
case that is of any importance.

Anyway, thank you, Jari, for your efforts so far. If you have any other
ideas, please let me know.

Regards,

David