I now have another issue. My binary fails to mmap a file within lkvm
sandbox. The same binary works fine on host and in qemu. I've added
strace into sandbox script, and here is the output:

[pid 837] openat(AT_FDCWD, "syzkaller-shm048878722", O_RDWR|O_CLOEXEC) = 5
[pid 837] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0) = -1 EINVAL (Invalid argument)

I don't see anything that can potentially cause EINVAL here. Is it
possible that lkvm somehow affects kernel behavior here?

I run lkvm as:

$ taskset 1 /kvmtool/lkvm sandbox --disk syz-0 --mem=2048 --cpus=2 --kernel /arch/x86/boot/bzImage --network mode=user --sandbox /workdir/kvm/syz-0.sh


On Mon, Oct 19, 2015 at 4:20 PM, Sasha Levin <sasha.levin@xxxxxxxxxx> wrote:
> On 10/19/2015 05:28 AM, Dmitry Vyukov wrote:
>> On Mon, Oct 19, 2015 at 11:22 AM, Andre Przywara <andre.przywara@xxxxxxx> wrote:
>>> Hi Dmitry,
>>>
>>> On 19/10/15 10:05, Dmitry Vyukov wrote:
>>>> On Fri, Oct 16, 2015 at 7:25 PM, Sasha Levin <sasha.levin@xxxxxxxxxx> wrote:
>>>>> On 10/15/2015 04:20 PM, Dmitry Vyukov wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am trying to run a program in lkvm sandbox so that it communicates
>>>>>> with a program on host. I run lkvm as:
>>>>>>
>>>>>> ./lkvm sandbox --disk sandbox-test --mem=2048 --cpus=4 --kernel
>>>>>> /arch/x86/boot/bzImage --network mode=user -- /my_prog
>>>>>>
>>>>>> /my_prog then connects to a program on host over a tcp socket.
>>>>>> I see that host receives some data, sends some data back, but then
>>>>>> my_prog hangs on network read.
>>>>>>
>>>>>> To localize this I wrote 2 programs (attached). ping is run on host
>>>>>> and pong is run from lkvm sandbox. They successfully establish tcp
>>>>>> connection, but after some iterations both hang on read.
>>>>>>
>>>>>> Networking code in Go runtime is there for more than 3 years, widely
>>>>>> used in production and does not have any known bugs. However, it uses
>>>>>> epoll edge-triggered readiness notifications that are known to be tricky.
>>>>>> Is it possible that lkvm contains some networking bug? Can it be
>>>>>> related to the data races in lkvm I reported earlier today?
>>>
>>> Just to let you know:
>>> I think we have seen networking issues in the past - root over NFS had
>>> issues IIRC. Will spent some time on debugging this and it looked like a
>>> race condition in kvmtool's virtio implementation. I think pinning
>>> kvmtool's virtio threads to one host core made this go away. However
>>> although he tried hard (even by Will's standards!) he couldn't find
>>> the real root cause or a fix at the time he looked at it and we found
>>> other ways to work around the issues (using virtio-blk or initrd's).
>>>
>>> So it's quite possible that there are issues. I haven't had time yet to
>>> look at your sanitizer reports, but it looks like a promising approach
>>> to find the root cause.
>>
>>
>> Thanks, Andre!
>>
>> ping/pong does not hang within at least 5 minutes when I run lkvm
>> under taskset 1.
>>
>> And, yeah, this pretty strongly suggests a data race. ThreadSanitizer
>> can point you to the bug within a minute, so you just need to say
>> "aha! it is here". Or maybe not. There are no guarantees. But if you
>> already spent significant time on this, then checking the reports
>> definitely looks like a good idea.
>
> Okay, that's good to know.
>
> I have a few busy days, but I'll definitely try to clear up these reports
> as they seem to be pointing to real issues.
>
>
> Thanks,
> Sasha
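
For the mmap EINVAL reported at the top of this message, a small probe may help narrow things down. The sketch below is only an illustration: it assumes (untested) that the result could depend on the directory, and therefore the filesystem, that the file lives in. The helper name try_shared_mmap and the "shm-probe-" file prefix are made up for this example and are not part of the original binary. It creates a file of the same size as in the strace output, attempts the same PROT_READ|PROT_WRITE, MAP_SHARED mapping, and reports the errno name if the mapping fails.

```python
import errno
import mmap
import os
import tempfile


def try_shared_mmap(directory, length=1 << 20):
    """Try a read/write MAP_SHARED mapping of a fresh file in `directory`.

    Returns None if the mapping succeeds and is writable, otherwise the
    symbolic errno name (for example "EINVAL") reported by mmap.
    Hypothetical helper, for illustration only.
    """
    fd, path = tempfile.mkstemp(prefix="shm-probe-", dir=directory)
    try:
        # Give the file the full length so the mapping is backed by real pages.
        os.ftruncate(fd, length)
        try:
            m = mmap.mmap(
                fd,
                length,
                flags=mmap.MAP_SHARED,
                prot=mmap.PROT_READ | mmap.PROT_WRITE,
            )
        except OSError as e:
            # Return the errno name rather than raising, so results from
            # several directories can be compared side by side.
            return errno.errorcode.get(e.errno, str(e.errno))
        # Touch the mapping to confirm that it is actually writable.
        m[0:1] = b"x"
        m.close()
        return None
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling try_shared_mmap(".") from the sandbox script's working directory, and again with a directory on a different mount, on the host, and in qemu, would show whether the failure follows the directory or the environment as a whole.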