On Thursday, 2024-08-15 at 09:14:27 +02, Linux regression tracking (Thorsten Leemhuis) wrote:
> [side note: the message I have been replying to at least when downloaded
> from lore has two message-ids, one of them identical two a older
> message, which is why this looks odd in the lore archives:
> https://lore.kernel.org/all/20240511031404.30903-1-xuanzhuo@xxxxxxxxxxxxxxxxx/]
>

Yes, I saw that too, hence I responded to patch 1 in the series, rather
than the cover letter.

> On 14.08.24 08:59, Michael S. Tsirkin wrote:
>> Note: Xuan Zhuo, if you have a better idea, pls post an alternative
>> patch.
>>
>> Note2: untested, posting for Darren to help with testing.
>>
>> Turns out unconditionally enabling premapped
>> virtio-net leads to a regression on VM with no ACCESS_PLATFORM, and with
>> sysctl net.core.high_order_alloc_disable=1
>>
>> where crashes and scp failures were reported (scp a file 100M in size to VM):
>> [...]
>
> TWIMC, there is a regression report on lore and I wonder if this might
> be related or the same problem, as it also mentioned a "get_swap_device:
> Bad swap file entry" error:
> https://bugzilla.kernel.org/show_bug.cgi?id=219154
>

I took a look at the stack traces. They don't look similar to what I was
seeing, but I wasn't running with ASAN enabled in the kernel.

Most of the traces that I was seeing looked like the ones in the e-mail
from Si-Wei:

https://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@xxxxxxxxxx/

We could trigger it only when the sysctl value was set like this:
- net.core.high_order_alloc_disable=1

With that set, it would immediately panic on any relatively large
download, e.g. wget of a few RPMs, or similar.

The best I can suggest is to try reverting them in a custom kernel and
see if that fixes this problem too.

Thanks,
Darren.

> To quote:
>
> """
> Hello,
>
> I've encountered repeated crashes or freezes when a KVM VM receives
> large amounts of data over the network while the system is under memory
> load and performing I/O operations. The crashes sometimes occur in the
> filesystem code (ext4 and btrfs, at least), but they also happen in
> other locations.
>
> This issue occurs on my custom builds using kernel versions v6.10 to
> v6.11-rc2, with virtio network and disk drivers, and either Ubuntu 22.04
> or Debian 12 user space.
>
> The same kernel build did not crash on an Azure VM, which does not use
> the virtio network driver. Since this issue only appears when receiving
> data, I suspect there could be an issue related to the virtio interface
> or receive buffer handling.
>
> This issue did not occur on the Debian backport kernel 6.9.7-1~bpo12+1
> amd64.
>
> Steps to Reproduce:
> 1. Setup a small VM on a KVM host.
>    I tested this on an x86_64 KVM VM with 1 CPU, 512 MB RAM, 2 GB SWAP
>    (the smallest configuration from Vultr), using a Debian 12 user space,
>    virtio disk, and virtio net.
> 2. Induce high memory and I/O load. Run the following command:
>      stress --vm 2 --hdd 1
>    (Adjust --vm to to occupy all the RAM)
>    This slows down the system but does not cause a crash.
> 3. Send large data to the VM.
>    I used `iperf3 -s` on the VM and sent data using `iperf3 -c` from
>    another host. The system crashes within a few seconds to a few minutes.
>    (The reverse direction `iperf3 -c -R` did not cause a crash.)
>
> The OOPS messages are mostly general protection faults, but sometimes I
> see "Bad pagetable" or other errors, such as:
>   Oops: general protection fault, probably for non-canonical address
>   0x2f9b7fa5e2bde696: 0000 [#1] PREEMPT SMP PTI
>   Oops: Oops: 0000 [#1] PREEMPT SMP PTI
>   Oops: Bad pagetable: 000d [#1] PREEMPT SMP PTI
>
> In some cases, dmesg contains something like:
>   UBSAN: shift-out-of-bounds in lib/xarray.c:158:34
>
> When the system freezes without crash, I sometimes found BUGON messages
> in some cases, such as:
>   get_swap_device: Bad swap file entry 3403b0f5b2584992
>   BUG: Bad page map in process stress pte:c42f93fac0299e1d pmd:0d9b2047
>   BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_ANONPAGES val:2
>   BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_SWAPENTS val:-1
>
> Thanks.
> """
>
> Ciao, Thorsten
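
For anyone comparing oops output across the two reports: the address in
the first quoted "general protection fault" line can be checked by hand.
On x86_64 an address is canonical only if all the bits above the
implemented virtual-address width are copies of the top implemented bit.
The sketch below is only an illustration of that rule, not anything taken
from the kernel; the function name is made up, and the 48-bit default
assumes 4-level paging (pass 57 for 5-level paging).

```python
def is_canonical_x86_64(addr: int, va_bits: int = 48) -> bool:
    """Hypothetical helper: True if addr is a canonical x86_64 address.

    Assumes va_bits implemented virtual-address bits (48 for 4-level
    paging, 57 for 5-level paging). Bits 63..(va_bits - 1) must be
    either all zero or all one.
    """
    upper = addr >> (va_bits - 1)            # bits 63..(va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1  # same field with every bit set
    return upper == 0 or upper == all_ones


# Address copied from the quoted oops line.
fault_addr = 0x2F9B7FA5E2BDE696
print(hex(fault_addr), "canonical:", is_canonical_x86_64(fault_addr))
```

The upper bits of that value are neither all zero nor all one under
either paging mode, so it is not a slightly-off pointer but something
that looks like arbitrary data, which would at least be consistent with
receive buffers being overwritten.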