Hi Darrick,

(2010/04/06 7:02), Darrick J. Wong wrote:
> Hi all,
>
> I wrote a program called e4frag that deliberately tries to fragment an ext4
> filesystem via EXT4_IOC_MOVE_EXT so that I could run e4defrag through its
> paces. While running e4frag and e4defrag concurrently on a kernel source
> tree, I discovered ongoing file corruption. It appears that if e4frag and
> e4defrag hit the same file at the same time, the file ends up with a 4K data
> block from somewhere else. "Somewhere else" seems to be a small chunk of
> binary gibberish followed by contents from other files(!)
>
> Obviously this isn't a good thing to see, since today it's header files but
> tomorrow it could be the credit card/SSN database. :)
>
> Ted asked me to send out a copy of the program ASAP, so the test program
> source code is at the end of this message. To build it, run:
>
> $ gcc -o e4frag -O2 -Wall e4frag.c
>
> and then to run it:
>
> (unpack something in /path/to/files)
> $ cp -pRdu /path/to/files /path/to/intact_files
> $ while true; do e4defrag /path/to/files& done
> $ while true; do ./e4frag -m 500 -s random /path/to/files& done
> $ while true; do diff -Naurp /path/to/intact_files /path/to/files; done
>
> ...and wait for diff to cough up differences.
>
> This seems to happen on 2.6.34-rc3, and only if e4frag and e4defrag are
> running concurrently. Running e4frag or e4defrag in a serial loop doesn't
> produce this corruption, so I think it's purely a concurrent access problem.

Somehow, I couldn't reproduce this problem.
My environment is:

Arch: i386
Kernel: 2.6.34-rc3
e2fsprogs: 1.41.11
Mount options: delalloc, data=ordered, async
Block size: 4KB
Partition size: 100GB

Is there any difference in your case?
And how long does it take for this file corruption to be detected?

I ran the program below all day long, but the problem did not occur.

---
#!/bin/bash

TARGET="/mnt/mp1/TEST/linux-2.6.34-rc3"
ORIG="/mnt/mp1/TEST/linux-2.6.34-rc3-orig"

cp -pRdu $TARGET $ORIG

while true; do ./e4defrag -v $TARGET & done
while true; do ./e4frag -m 500 -s random $TARGET & done
while true; do diff -Naurp $ORIG $TARGET; done
---

# The OOM killer sometimes runs while this program is running, though,
# because it puts a heavy load on the system.

Regards,
Akira Fujita
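
A note on running the three loops from one script: in bash, `while true; do cmd & done` never finishes, so the loops that follow it in the same file are reached only if each loop is started separately (for example, one per terminal). It also keeps spawning background processes without limit, which may be related to the OOM killer activity. The following is only a sketch of one way to run the three loops side by side from a single script. The function names, the `DEFRAG_CMD` and `FRAG_CMD` variables, the stop file, and the choice to run a single e4defrag and a single e4frag at a time are all assumptions, not part of the original procedure. Limiting each tool to one instance lowers the load and may also lower the chance of hitting the race.

```shell
#!/bin/bash
# Sketch only: names and structure are illustrative assumptions.

# Repeat a command until the stop file appears. The command's output is
# discarded so that only diff output reaches the terminal.
loop_until_stopped() {
    local stop_file=$1
    shift
    while [ ! -e "$stop_file" ]; do
        "$@" >/dev/null 2>&1
    done
}

# Compare the intact copy with the stressed tree until diff reports a
# difference, then create the stop file and return 1.
watch_for_corruption() {
    local stop_file=$1 orig=$2 target=$3
    while [ ! -e "$stop_file" ]; do
        if ! diff -Naurp "$orig" "$target"; then
            touch "$stop_file"
            return 1
        fi
    done
    return 0
}

# Run one e4defrag loop and one e4frag loop in the background while diff
# runs in the foreground. Returns 1 once a difference has been found.
run_stress() {
    local orig=$1 target=$2 stop_dir stop_file status=0
    stop_dir=$(mktemp -d) || return 2
    stop_file=$stop_dir/stop

    # Unquoted on purpose so that "./e4defrag -v" splits into words.
    loop_until_stopped "$stop_file" ${DEFRAG_CMD:-./e4defrag -v} "$target" &
    loop_until_stopped "$stop_file" ${FRAG_CMD:-./e4frag -m 500 -s random} "$target" &

    watch_for_corruption "$stop_file" "$orig" "$target" || status=$?

    touch "$stop_file"   # tell both background loops to finish
    wait                 # reap them before cleaning up
    rm -rf "$stop_dir"
    return "$status"
}
```

With the paths from the script above, this would be called as `run_stress /mnt/mp1/TEST/linux-2.6.34-rc3-orig /mnt/mp1/TEST/linux-2.6.34-rc3` after the `cp -pRdu` step. It keeps running until diff finds a difference, so interrupt it by hand if nothing shows up.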