On Tue, 14 Feb 2023 16:54:02 -0500
Gregory Price <gregory.price@xxxxxxxxxxxx> wrote:

> On Tue, Feb 14, 2023 at 04:51:53PM -0500, Gregory Price wrote:
> > On Tue, Feb 14, 2023 at 09:18:24PM +0000, Jonathan Cameron wrote:
> > > On Tue, 14 Feb 2023 14:01:23 -0500
> > > Gregory Price <gregory.price@xxxxxxxxxxxx> wrote:
> > >
> > > Could you test it with TCG (just drop --enable-kvm)? We have a known
> > > limitation with x86 instructions running out of CXL emulated memory
> > > (side effect of emulating the interleave). You'll need a fix even on TCG
> > > for the corner case of an instruction bridging from normal ram to cxl memory.
> > > https://lore.kernel.org/qemu-devel/20230206193809.1153124-1-richard.henderson@xxxxxxxxxx/
> > >
> > > Performance will be bad, but so far this is only way we can do it correctly.
> > >
> > > Jonathan
> > >
> >
> > Siiiggghh... i had this patch and dropped --enable-kvm, but forgot to
> > drop "accel=kvm" from the -machine line
> >
> > This was the issue.
> >
> > And let me tell you, if you numactl --membind=1 python, it is
> > IMPRESSIVELY slow. I wonder if it's even hitting a few 100k
> > instructions a second.
> >
> >
> > This appears to be the issue. When I get a bit more time, try to dive
> > into the deep dark depths of qemu memory regions to see how difficult
> > a non-mmio fork might be, unless someone else is already looking at it.
> >
> > ~Gregory
>
> Just clarifying one thing: Even with the patch, KVM blows up.
> Disabling KVM fixes this entirely. I haven't tested without KVM but
> with the patch, i will do that now.

Yup. The patch only fixes TCG, so that's expected behavior.
Fingers crossed on this 'working'.

I'm open to suggestions on how to work around the problem with KVM, or
indeed on how to allow TCG to cache the instructions (right now it has to
fetch and emulate each instruction on its own).
I can envision how we might do it for KVM: userspace page fault handling
could be used to get a fault up to QEMU, which could then stitch in a
cache of the underlying memory as a stage 2 translation to the page (a
little bit like how postcopy migration works), though I've not prototyped
anything. I think it would be complex code that would be little used, so
we may just have to cope with the emulation being slow. The intent is very
much to be able to test the kernel code etc., not to test it quickly :)

Jonathan
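
For anyone reproducing the slip described in the quoted text (dropping --enable-kvm but leaving "accel=kvm" on the -machine line), here is a minimal sketch of a helper that rewrites a QEMU argument list to use TCG. The function name `force_tcg` and the example argument list are hypothetical, not taken from the thread, and the helper only handles the two spellings mentioned above.

```python
def force_tcg(argv):
    """Return a copy of a QEMU argv with KVM acceleration swapped for TCG.

    Hypothetical helper: it only knows about --enable-kvm / -enable-kvm
    and an "accel=kvm" option on the -machine (or -M) argument.
    """
    out = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        # Drop the standalone KVM switch entirely.
        if arg in ("-enable-kvm", "--enable-kvm"):
            i += 1
            continue
        # Rewrite accel=kvm inside the comma-separated -machine options.
        if arg in ("-machine", "-M") and i + 1 < len(argv):
            opts = argv[i + 1].split(",")
            opts = ["accel=tcg" if o == "accel=kvm" else o for o in opts]
            out += [arg, ",".join(opts)]
            i += 2
            continue
        out.append(arg)
        i += 1
    return out


# Example argument list (an assumption for illustration only).
example = [
    "qemu-system-x86_64", "--enable-kvm",
    "-machine", "q35,cxl=on,accel=kvm",
    "-m", "4G",
]
print(" ".join(force_tcg(example)))
```

Both places have to change: removing only --enable-kvm leaves KVM selected through the -machine options, which is the situation Gregory ran into.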