On Fri 20-03-20 11:04:50, Ritesh Harjani wrote:
> On 3/19/20 6:54 PM, Ritesh Harjani wrote:
> > On 3/18/20 9:17 AM, Aneesh Kumar K.V wrote:
> > > Hi,
> > >
> > > With new vm install I am finding corruption with the vm image if I
> > > follow up the install with echo 3 > /proc/sys/vm/drop_caches
> > >
> > > The file system reports below error.
> > >
> > > Begin: Running /scripts/local-bottom ... done.
> > > Begin: Running /scripts/init-bottom ...
> > > [ 4.916017] EXT4-fs error (device vda2): ext4_lookup:1700: inode
> > > #787185: comm sh: iget: checksum invalid
> > > done.
> > > [ 5.244312] EXT4-fs error (device vda2): ext4_lookup:1700: inode
> > > #917954: comm init: iget: checksum invalid
> > > [ 5.257246] EXT4-fs error (device vda2): ext4_lookup:1700: inode
> > > #917954: comm init: iget: checksum invalid
> > > /sbin/init: error while loading shared libraries: libc.so.6: cannot
> > > open shared object file: Error 74
> > > [ 5.271207] Kernel panic - not syncing: Attempted to kill init!
> > > exitcode=0x00007f00
> > >
> > > And debugfs reports
> > >
> > > debugfs: stat <917954>
> > > Inode: 917954 Type: bad type Mode: 0000 Flags: 0x0
> > > Generation: 0 Version: 0x00000000
> > > User: 0 Group: 0 Size: 0
> > > File ACL: 0
> > > Links: 0 Blockcount: 0
> > > Fragment: Address: 0 Number: 0 Size: 0
> > > ctime: 0x00000000 -- Wed Dec 31 18:00:00 1969
> > > atime: 0x00000000 -- Wed Dec 31 18:00:00 1969
> > > mtime: 0x00000000 -- Wed Dec 31 18:00:00 1969
> > > Size of extra inode fields: 0
> > > Inode checksum: 0x00000000
> > > BLOCKS:
> > > debugfs:
> > >
> > > Bisecting this finds
> > > Commit 244adf6426ee31a83f397b700d964cff12a247d3("ext4: make
> > > dioread_nolock the default")
> > > as bad. If I revert the same on top of linus
> > > upstream(fb33c6510d5595144d585aa194d377cf74d31911)
> > > I don't hit the corruption anymore.
> >
> > Tried replicating this and could easily replicate it on Power box.
> > I tried to reproduce this on x86 too, but could not reproduce on x86.
> > Now one difference on Power could be that pagesize is 64K and fs
> > blocksize is 4K.
> >
> > The issue looks like the guest qemu image file is not properly written
> > back, after host does echo 3 > drop_caches. (correct me if this is not
> > the case).
>
> Ok. So tried this issue with passing "cache=directsync" parameter to
> drive file. This parameter says it should bypass the host side page
> cache. With this parameter, I don't see this issue on Power box.

OK, so this likely means that there is something hosed in the writeback path
using unwritten extents when blocksize < pagesize. Maybe we miss some
conversion of unwritten extent to a written one and thus after dropping
caches we effectively lose data?

								Honza

> > I tried replicating via below test, but it could not reproduce.
> >
> > Any idea what kind of unit test could be written for this?
> > I am not sure how exactly qemu is writing to its image file.
> >
> >
> > 1. Create 2 files. "mmap-file", "mmap-data".
> > 2. "mmap-file" is a 2GB sparse file. Then at some random offsets (tried
> > with both 64KB align and 4KB align offsets), try to write
> > pagesize/blocksize amount of known data pattern.
> > 3. These offsets (which are pagesize/blocksize align) are recorded into
> > "mmap-data" file via normal read/write calls.
> > 4. Then after we wrote to both files, we munmap the "mmap-file" and
> > close both of these files.
> > 5. Then we do echo 3 > drop_caches.
> > 6. Then in the verify phase, using the offsets written in "mmap-data"
> > file, I read the "mmap-file" to verify if it's contents are proper or
> > not.
> > With that could not reproduce this issue.
> >
> >
> > -ritesh
> >
> >
>
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
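
For anyone who wants to retry the six-step mmap test quoted above, here is a
rough Python sketch of it. It is not the program that was actually run in this
thread, and since that test did not trigger the corruption, this one may well
not either. It also does not model how qemu writes its image file. The
function names (sizes, pattern, write_phase, drop_caches, verify_phase) and the
on-disk format of "mmap-data" (a list of little-endian 64-bit offsets) are
assumptions made for illustration only.

```python
import mmap
import os
import random
import struct


def sizes(path):
    """Return (page size, fs block size) so a 64K/4K setup can be confirmed."""
    return os.sysconf("SC_PAGE_SIZE"), os.statvfs(path).f_bsize


def pattern(offset, length):
    """Known, non-zero pattern derived from the offset (a hole reads as zeros)."""
    seed = b"DATA" + struct.pack("<Q", offset)
    return (seed * (length // len(seed) + 1))[:length]


def write_phase(data_path, index_path, file_size, chunk, count, seed=0):
    """Steps 1-4: sparse file, mmap writes at aligned offsets, record offsets."""
    rng = random.Random(seed)
    slots = rng.sample(range(file_size // chunk), count)
    offsets = sorted(slot * chunk for slot in slots)

    with open(data_path, "wb") as f:
        f.truncate(file_size)              # sparse "mmap-file"

    with open(data_path, "r+b") as f:
        mm = mmap.mmap(f.fileno(), file_size)   # shared, read/write mapping
        for off in offsets:
            mm[off:off + chunk] = pattern(off, chunk)
        mm.close()                         # munmap, no explicit msync

    with open(index_path, "wb") as idx:    # "mmap-data" via normal write calls
        for off in offsets:
            idx.write(struct.pack("<Q", off))
    return offsets


def drop_caches():
    """Step 5: echo 3 > drop_caches. Needs root; returns False if not possible."""
    try:
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")
        return True
    except OSError:
        return False


def verify_phase(data_path, index_path, chunk):
    """Step 6: re-read every recorded offset, return the offsets that differ."""
    with open(index_path, "rb") as idx:
        offsets = [off for (off,) in struct.iter_unpack("<Q", idx.read())]
    bad = []
    with open(data_path, "rb") as f:
        for off in offsets:
            f.seek(off)
            if f.read(chunk) != pattern(off, chunk):
                bad.append(off)
    return bad
```

A run matching the description would be something like
write_phase("mmap-file", "mmap-data", 2 << 30, 4096, 1000), then drop_caches()
as root, then verify_phase("mmap-file", "mmap-data", 4096), repeated with a
64KB chunk. An empty list from verify_phase means every chunk read back intact.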