On 11/1/19 1:48 PM, Jason Gunthorpe wrote:
> On Wed, Oct 30, 2019 at 12:55:37PM -0400, Boris Ostrovsky wrote:
>> On 10/28/19 4:10 PM, Jason Gunthorpe wrote:
>>> From: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
>>>
>>> gntdev simply wants to monitor a specific VMA for any notifier events,
>>> this can be done straightforwardly using mmu_range_notifier_insert() over
>>> the VMA's VA range.
>>>
>>> The notifier should be attached until the original VMA is destroyed.
>>>
>>> It is unclear if any of this is even sane, but at least a lot of duplicate
>>> code is removed.
>> I didn't have a chance to look at the patch itself yet but as a heads-up
>> --- it crashes dom0.
> Thanks Boris. I spent a bit of time and got a VM running with a xen
> 4.9 hypervisor and a kernel with this patch series. It a ubuntu bionic
> VM with the distro's xen stuff.
>
> Can you give some guidance how you made it crash?

It crashes trying to dereference mrn->ops->invalidate in
mn_itree_invalidate() when a guest exits. I don't think you've
initialized notifier ops. I don't see you using gntdev_mmu_ops anywhere.

-boris

> I see the VM
> autoloaded gntdev:
>
> Module                  Size  Used by
> xen_gntdev             24576  2
> xen_evtchn             16384  1
> xenfs                  16384  1
> xen_privcmd            24576  16 xenfs
>
> And lsof says several xen processes have the chardev open:
>
> xenstored 819 root 13u CHR 10,53 0t0 19595 /dev/xen/gntdev
> xenconsol 857 root 8u CHR 10,53 0t0 19595 /dev/xen/gntdev
> xenconsol 857 860 root 8u CHR 10,53 0t0 19595 /dev/xen/gntdev
>
> But no crashing..
>
> However, I wasn't able to get my usual debug kernel .config to boot
> with the xen hypervisor, it crashes on early boot with:
>
> (XEN) Dom0 has maximum 8 VCPUs
> (XEN) Scrubbing Free RAM on 1 nodes using 8 CPUs
> (XEN) .done.
> (XEN) Initial low memory virq threshold set at 0x1000 pages.
> (XEN) Std. Loglevel: All
> (XEN) Guest Loglevel: All
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
> (XEN) Freed 468kB init memory
> (XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0002]
> (XEN) Pagetable walk from fffffbfff0480fbe:
> (XEN) L4[0x1f7] = 0000000000000000 ffffffffffffffff
> (XEN) domain_crash_sync called from entry.S: fault at ffff82d080348a06 entry.o#create_bounce_frame+0x135/0x15f
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-4.9.2 x86_64 debug=n Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e033:[<ffffffff82b9f731>]
> (XEN) RFLAGS: 0000000000000296 EM: 1 CONTEXT: pv guest (d0v0)
> (XEN) rax: fffffbfff0480fbe rbx: 0000000000000000 rcx: 00000000c0000101
> (XEN) rdx: 00000000ffffffff rsi: ffffffff84026000 rdi: ffffffff82cb4a20
> (XEN) rbp: ffffffff82407ff8 rsp: ffffffff82407da0 r8: 0000000000000000
> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> (XEN) r12: 0000000000000000 r13: 1ffffffff0480fbe r14: 0000000000000000
> (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000003506e0
> (XEN) cr3: 0000000034027000 cr2: fffffbfff0480fbe
> (XEN) fsb: 0000000000000000 gsb: ffffffff82b61000 gss: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
>
> Which is surely some .config issue, but I didn't figure out what.
>
> Jason
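
For reference, the failure mode described in the reply (calling through mrn->ops->invalidate when ops was never set) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct range_notifier, struct range_notifier_ops, demo_ops, notify_invalidate() and notifier_selfcheck() are hypothetical stand-ins, not the kernel's definitions, and the NULL guard is there to make the problem visible rather than to suggest what the kernel does or should do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel structures. */
struct range_notifier;

struct range_notifier_ops {
    int (*invalidate)(struct range_notifier *rn);
};

struct range_notifier {
    const struct range_notifier_ops *ops; /* must be set before any callback */
    int invalidated;                      /* counts callback invocations */
};

/* Example callback: records that an invalidation happened. */
static int demo_invalidate(struct range_notifier *rn)
{
    rn->invalidated++;
    return 1;
}

static const struct range_notifier_ops demo_ops = {
    .invalidate = demo_invalidate,
};

/*
 * Calls the callback through the ops table. Without the guard, an
 * uninitialized ops pointer would be dereferenced here, which is the
 * kind of crash reported above. Returns -1 when ops is unset.
 */
static int notify_invalidate(struct range_notifier *rn)
{
    if (rn->ops == NULL || rn->ops->invalidate == NULL)
        return -1;
    return rn->ops->invalidate(rn);
}

/* Small self-check covering both the unset and the initialized case. */
static void notifier_selfcheck(void)
{
    struct range_notifier unset = { .ops = NULL, .invalidated = 0 };
    struct range_notifier set = { .ops = &demo_ops, .invalidated = 0 };

    assert(notify_invalidate(&unset) == -1);
    assert(notify_invalidate(&set) == 1);
    assert(set.invalidated == 1);
}
```

In the patch under discussion the equivalent of demo_ops would presumably be gntdev_mmu_ops, attached when the notifier is inserted; that mapping is an assumption drawn from the reply, not something verified against the series.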