
Re: [syzbot] general protection fault in skb_dequeue (3)

On 02.02.23 09:52, David Howells wrote:
Hi John, David,

Could you have a look at this?

syzbot found the following issue on:

HEAD commit:    80bd9028feca Add linux-next specific files for 20230131
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1468e369480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=904dc2f450eaad4a
dashboard link: https://syzkaller.appspot.com/bug?extid=a440341a59e3b7142895
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12c5d2be480000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11259a79480000
...
The issue was bisected to:

commit 920756a3306a35f1c08f25207d375885bef98975
Author: David Howells <dhowells@xxxxxxxxxx>
Date:   Sat Jan 21 12:51:18 2023 +0000

     block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=170384f9480000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=148384f9480000
console output: https://syzkaller.appspot.com/x/log.txt?x=108384f9480000
...
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 2838 Comm: kworker/u4:6 Not tainted 6.2.0-rc6-next-20230131-syzkaller-09515-g80bd9028feca #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023
Workqueue: phy4 ieee80211_iface_work
RIP: 0010:__skb_unlink include/linux/skbuff.h:2321 [inline]
RIP: 0010:__skb_dequeue include/linux/skbuff.h:2337 [inline]
RIP: 0010:skb_dequeue+0xf5/0x180 net/core/skbuff.c:3511
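For readers decoding the oops header: the non-canonical address 0xdffffc0000000001 is what generic KASAN's shadow-address computation yields for the reported range [0x8-0xf], which would be consistent with a NULL pointer dereferenced at a small offset inside __skb_unlink. A minimal sketch of that arithmetic follows. It assumes the usual x86-64 generic-KASAN constants (shadow offset 0xdffffc0000000000, scale shift 3), and the helper names are illustrative rather than the kernel's own.

```c
#include <stdint.h>

/* Assumed x86-64 generic KASAN parameters; not taken from the report itself. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL

/* Address of the shadow byte that KASAN consults for a given memory address. */
static uint64_t kasan_mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* First byte of the 8-byte memory granule described by a shadow address. */
static uint64_t kasan_shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}
```

Under these assumptions, every address from 0x8 to 0xf maps to shadow byte 0xdffffc0000000001, and mapping that shadow address back gives 0x8, matching the range KASAN printed.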

I don't think this is specifically related to anything networking.  I've run
it a few times and weird stuff happens in various places.  I'm wondering if
it's related to FOLL_PIN in some way.

The syzbot test in question does the following:

    #{"repeat":true,"procs":1,"slowdown":1,"sandbox":"none","sandbox_arg":0,"netdev":true,"cgroups":true,"close_fds":true,"usb":true,"wifi":true,"sysctl":true,"tmpdir":true}
    socket(0x0, 0x2, 0x0)
    epoll_create(0x7)
    r0 = creat(&(0x7f0000000040)='./bus\x00', 0x9)
    ftruncate(r0, 0x800)
    lseek(r0, 0x200, 0x2)
    r1 = open(&(0x7f0000000000)='./bus\x00', 0x24000, 0x0)  <-- O_DIRECT
    sendfile(r0, r1, 0x0, 0x1dd00)

Basically a DIO splice from a file to itself.
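To double-check the "<-- O_DIRECT" annotation on the open() call, the raw flag word 0x24000 can be decoded by hand. The sketch below hard-codes a few x86-64 Linux open(2) flag values (an assumption: the values differ on some other architectures, and the table is deliberately incomplete); the struct and function names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

struct flag_name {
	unsigned int bit;
	const char *name;
};

/* A subset of x86-64 Linux open(2) flags, hard-coded on purpose. */
static const struct flag_name x86_64_open_flags[] = {
	{ 0x40,    "O_CREAT" },
	{ 0x200,   "O_TRUNC" },
	{ 0x400,   "O_APPEND" },
	{ 0x4000,  "O_DIRECT" },
	{ 0x10000, "O_DIRECTORY" },
	{ 0x20000, "O_NOFOLLOW" },
	{ 0x80000, "O_CLOEXEC" },
};

/* Render a raw flag word as "O_RDONLY|O_DIRECT|..." into buf (len must be > 0). */
static const char *decode_open_flags(unsigned int flags, char *buf, size_t len)
{
	static const char *const access[] = {
		"O_RDONLY", "O_WRONLY", "O_RDWR", "O_ACCMODE"
	};
	unsigned int rest = flags & ~3u;	/* everything above the access mode */
	size_t i, used;

	snprintf(buf, len, "%s", access[flags & 3]);
	for (i = 0; i < sizeof(x86_64_open_flags) / sizeof(x86_64_open_flags[0]); i++) {
		if (rest & x86_64_open_flags[i].bit) {
			used = strlen(buf);
			snprintf(buf + used, len - used, "|%s", x86_64_open_flags[i].name);
			rest &= ~x86_64_open_flags[i].bit;
		}
	}
	if (rest) {
		/* Bits not in the table are printed raw rather than guessed at. */
		used = strlen(buf);
		snprintf(buf + used, len - used, "|0x%x", rest);
	}
	return buf;
}
```

With these values, decode_open_flags(0x24000, buf, sizeof(buf)) gives "O_RDONLY|O_DIRECT|O_NOFOLLOW", which is the same flag combination the hand-written tester passes to open().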

I've hand-written my own much simpler tester (see attached).  You need to run
at least two copies in parallel, I think, to trigger the bug.  It's possible
truncate is interfering somehow.

David
---
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/wait.h>

#define file_size 0x800
#define send_size 0x1dd00
#define repeat_count 1000

int main(int argc, char *argv[])
{
	int in, out, i, wt;

	if (argc != 2 || !argv[1][0]) {
		fprintf(stderr, "Usage: %s <file>\n", argv[0]);
		exit(2);
	}

	for (i = 0; i < repeat_count; i++) {
		switch (fork()) {
		case -1:
			perror("fork");
			exit(1);
		case 0:
			out = creat(argv[1], 0666);
			if (out < 0) {
				perror(argv[1]);
				exit(1);
			}

			if (ftruncate(out, file_size) < 0) {
				perror("ftruncate");
				exit(1);
			}

			if (lseek(out, file_size, SEEK_SET) < 0) {
				perror("lseek");
				exit(1);
			}

			in = open(argv[1], O_RDONLY | O_DIRECT | O_NOFOLLOW);
			if (in < 0) {
				perror("open");
				exit(1);
			}

			if (sendfile(out, in, NULL, send_size) < 0) {
				perror("sendfile");
				exit(1);
			}
			exit(0);

[as raised on IRC]

At first, I wondered whether this was related to shared anonymous pages getting pinned R/O, which would trigger COW-unsharing ... but I don't even see where we are supposed to use FOLL_PIN vs. FOLL_GET here. IOW, we're not supposed to access user space memory at all (via neither FOLL_GET nor FOLL_PIN), yet we still end up with a change in behavior.

--
Thanks,

David / dhildenb



