EXT4-fs: Intermittent segfault with memory corruption

Hello Ext4 Developers,

I hope this email finds you well. We are reaching out to report a
persistent issue that we have been facing on Windows Subsystem for
Linux (WSL)[1]. We have reproduced the problem on the v5.15, v6.1 and
v6.6 stable kernels, and also on the current upstream kernel. It takes
longer to reproduce on v5.15, but it is observable on every one of
these versions.

Issue Description:
Intermittent segfault with memory corruption. The timing of the
segfault and the kernel output both vary. One of the notable failures
manifests as a segfault accompanied by the following error message:

EXT4-fs error (device sdc): ext4_find_dest_de:2092: inode #32168:
block 2334198: comm dpkg: bad entry in directory: rec_len is smaller
than minimal - offset=0, inode=0, rec_len=0, size=4084 fake=0

and

EXT4-fs warning (device sdc): dx_probe:890: inode #27771: comm dpkg:
dx entry: limit 0 != root limit 508
EXT4-fs warning (device sdc): dx_probe:964: inode #27771: comm dpkg:
Corrupt directory, running e2fsck is recommended
EXT4-fs error (device sdc): ext4_empty_dir:3098: inode #27753: block
133944722: comm dpkg: bad entry in directory: rec_len is smaller than
minimal - offset=0, inode=0, rec_len=0, size=4096 fake=0

In other runs we see only a segfault message, where the faulting
process depends on which command we are testing with (dpkg, apt, gcc, ...):

dpkg[135]: segfault at 0 ip 00007f9209eb6a19 sp 00007ffd8a6a0b08 error
4 in libc-2.31.so[7f9209d6e000+159000] likely on CPU 1 (core 0, socket
0)
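To make the segfault line above easier to read, here is a minimal
sketch that decodes it. It assumes the x86 page-fault error-code bit
layout (bit 0 = protection violation, bit 1 = write, bit 2 = user
mode, bit 4 = instruction fetch) and that the kernel prints the error
code in hex. The function and field names are illustrative only. Note
that the offset is relative to the start of the mapping shown in
brackets, which is not necessarily the offset within the library file.

```python
import re

# Matches the "segfault at ..." line format quoted in this report.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(text):
    """Decode one segfault log message; return a dict, or None if no match."""
    # Mail clients wrap long lines, so collapse all whitespace first.
    line = " ".join(text.split())
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    ip = int(m["ip"], 16)
    info = {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "ip": ip,
        # x86 page-fault error code bits (assumed layout, see lead-in).
        "page_present": bool(err & 0x1),
        "access": "write" if err & 0x2 else "read",
        "user_mode": bool(err & 0x4),
        "instruction_fetch": bool(err & 0x10),
    }
    if m["base"] is not None:
        info["object"] = m["obj"]
        # Offset of the faulting instruction inside the reported mapping.
        info["offset_in_mapping"] = ip - int(m["base"], 16)
    return info
```

For the message quoted above this reports a user-mode read of address
0 (a NULL pointer dereference) at offset 0x148a19 into the libc-2.31.so
mapping.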

Reproduction Steps:
The steps to reproduce the issue are straightforward: run a package
install or upgrade. The larger the change, the better.

Installing GIMP has been our go-to test, though we have also
reproduced the failure with:
- apt upgrade
- apt install
- dpkg install
- gcc building source files
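The workloads above can be looped until the failure shows up. The
following is only a sketch of such a harness, not the procedure used
for this report: it assumes `dmesg` is readable by the current user
and that kernel log lines carry timestamps (so repeated messages are
distinct lines). The function names and the marker strings are
illustrative choices.

```python
import subprocess

# Substrings taken from the kernel messages quoted in this report.
MARKERS = ("EXT4-fs error", "EXT4-fs warning", "segfault at")


def read_kernel_log():
    """Return the kernel log as text (assumes `dmesg` is permitted)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout


def run_until_failure(command, max_runs=50, run=None, read_log=read_kernel_log):
    """Run `command` repeatedly until it fails or the kernel log complains.

    Returns (iteration, return_code, new_log_lines) for the first run that
    exits non-zero or adds a marker line to the log, or None if all
    `max_runs` iterations were clean.
    """
    if run is None:
        # Default runner: execute the command and report its exit status.
        run = lambda cmd: subprocess.run(cmd, check=False).returncode
    # Remember what was already in the log before the first run.
    seen = set(read_log().splitlines())
    for iteration in range(1, max_runs + 1):
        rc = run(command)
        new_lines = [
            line
            for line in read_log().splitlines()
            if line not in seen and any(marker in line for marker in MARKERS)
        ]
        if rc != 0 or new_lines:
            return iteration, rc, new_lines
    return None
```

As a hypothetical usage, `run_until_failure(["apt-get", "install",
"--reinstall", "-y", "gimp"])` would keep reinstalling the package
until either apt-get fails or a new EXT4-fs or segfault line appears.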

Observations:

- The issue occurs consistently across multiple kernel versions.
- Reproduction is faster on more recent kernels.
- v5.15 needs a longer run before the failure appears.
- Enabling additional debugging options that increase processing time
  seems to make the segfault harder to hit.

When DX_DEBUG is enabled, we observed instances during the
installation process (dpkg install) where both rlen and de->name_len
become 0.
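The zero values fit the error messages quoted earlier: a directory
block that reads back as all zeroes would yield inode=0, rec_len=0 and
name_len=0 at offset 0. The sketch below is a simplified model of the
directory-entry sanity check, not the kernel code. It assumes the
ext4_dir_entry_2 on-disk header layout (le32 inode, le16 rec_len, u8
name_len, u8 file_type), a 4 KiB block where rec_len is stored
unencoded, and that the minimal record length is the one for a
one-byte name. The `size` parameter mirrors the size= field in the
log lines.

```python
import struct

# Assumed on-disk header of struct ext4_dir_entry_2 (little-endian):
# inode (u32), rec_len (u16), name_len (u8), file_type (u8).
DIRENT_HEADER = struct.Struct("<IHBB")


def min_rec_len(name_len):
    """Header plus name bytes, rounded up to a multiple of 4."""
    return (name_len + DIRENT_HEADER.size + 3) & ~3


def check_dir_block(block, size=None):
    """Walk the entries of one directory block.

    Returns None if every entry looks sane, else (offset, problem) for the
    first bad entry. The walk stops there because a bad rec_len cannot be
    trusted to locate the next entry.
    """
    size = len(block) if size is None else size
    offset = 0
    while offset < size:
        if offset + DIRENT_HEADER.size > size:
            return offset, "directory entry overrun"
        _inode, rec_len, name_len, _ftype = DIRENT_HEADER.unpack_from(block, offset)
        if rec_len < min_rec_len(1):
            return offset, "rec_len is smaller than minimal"
        if rec_len % 4 != 0:
            return offset, "rec_len % 4 != 0"
        if rec_len < min_rec_len(name_len):
            return offset, "rec_len is too small for name_len"
        if offset + rec_len > size:
            return offset, "directory entry overrun"
        offset += rec_len
    return None
```

Calling `check_dir_block(bytes(4096))` on a zero-filled block fails at
offset 0 with "rec_len is smaller than minimal", which is the same
combination of values the kernel reported.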

We wanted to bring this to your attention and seek guidance on how we
could proceed with debugging and resolving this issue. Your insights
and assistance would be greatly appreciated.

Thank you for your time and consideration.


[1] What is Windows Subsystem for Linux:
    https://learn.microsoft.com/en-us/windows/wsl/about

-- 
       - Allen



