Re: fscache corruption in Linux 5.17?

Max Kellermann <mk@xxxxxxxxxx> wrote:

> At least one web server is still in this broken state right now.  So
> if you need anything from that server, tell me, and I'll get it.

Can you turn on:

echo 65536 >/sys/kernel/debug/tracing/buffer_size_kb
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_read/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_write/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_trunc/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_io_error/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_vfs_error/enable
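
If it's easier to toggle these on and off while you experiment, the five
enables above could be wrapped up roughly like this.  This is only a sketch:
the function name and the optional second argument are made up here for
illustration, and it assumes the tracing directory is the debugfs path used
above.

```shell
# Hypothetical helper (not part of any tool): write 0 or 1 to the "enable"
# file of the five cachefiles events listed above.
#   $1 = 0 or 1
#   $2 = tracing directory (assumed default: /sys/kernel/debug/tracing)
set_cachefiles_events() {
    state=$1
    tracing=${2:-/sys/kernel/debug/tracing}
    for ev in read write trunc io_error vfs_error; do
        echo "$state" >"$tracing/events/cachefiles/cachefiles_$ev/enable" || return 1
    done
}
```

Run as root, "set_cachefiles_events 1" would switch the events on and
"set_cachefiles_events 0" would switch them off again.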

Then try to trigger the bug if you can.  The trace can be viewed with:

cat /sys/kernel/debug/tracing/trace | less

The problem very likely happens on write rather than read.  If you know of a
file that's corrupt, turn on the tracing above and read that file.  Then look
in the trace buffer and you should see the corresponding lines and they should
have the backing inode in them, marked "B=iiii" where "iiii" is the inode
number of the file in hex.  You should be able to examine the backing file by
finding it with something like:

	find /var/cache/fscache -inum $((0xiiii))

and see if you can see the corruption in there.  Note that there may be blocks
of zeroes corresponding to unfetched file blocks.
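
Pulling the inode numbers out of the trace by hand gets tedious, so
something like the following might help.  It's a rough sketch that assumes
only that the field looks like "B=" followed by hex digits, as described
above; the function name is made up, and the sample line in the comment is
illustrative rather than copied from a real trace.

```shell
# Hypothetical helper: read trace text on stdin and print each distinct
# backing inode number in decimal, ready to pass to "find -inum".
# Illustrative input line (assumed layout, not real output):
#   cachefiles_read: o=00000001 B=1a2b s=0 l=1000
backing_inodes() {
    grep -o 'B=[0-9a-fA-F][0-9a-fA-F]*' | sort -u |
    while IFS='=' read -r tag hex; do
        # printf accepts a C-style 0x constant and prints it in decimal
        printf '%d\n' "0x$hex"
    done
}
```

You could then feed it the trace buffer, for example
"backing_inodes </sys/kernel/debug/tracing/trace", and run the find command
above on each number it prints.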

Also, what filesystem is backing your cachefiles cache?  It could be useful to
dump the extent list of the file.  You should be able to do this with
"filefrag -e".

As to why this happens, a write that's misaligned by 31 bytes should cause DIO
to a disk to fail - so it shouldn't be possible to write that.  However, I'm
doing fallocate and truncate on the file to shape it so that DIO will work on
it, so it's possible that there's a bug there.  The cachefiles_trunc trace
lines may help catch that.
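
To check whether a given offset or length from the trace would be acceptable
for DIO, the arithmetic is just a remainder.  A minimal sketch, assuming a
4096-byte block size purely for the example (the real alignment requirement
depends on the backing filesystem and device), with a made-up function name:

```shell
# Hypothetical helper: print how many bytes a value is past the previous
# block boundary; 0 means it is aligned.
#   $1 = offset or length in bytes
#   $2 = block size in bytes (4096 is an assumption for the example)
dio_misalignment() {
    echo $(( $1 % $2 ))
}

dio_misalignment 4127 4096   # prints 31: misaligned by 31 bytes
dio_misalignment 8192 4096   # prints 0: aligned
```

Any non-zero result for a cachefiles_write or cachefiles_trunc position
would be worth a closer look.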

David




