Re: Libvirt on little.BIG ARM systems unable to start guest if no cpuset is provided

On Tue, 14 Dec 2021 08:16:40 +0000,
Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> 
> 
> 
> On 2021/12/14 15:53, Michal Prívozník wrote:
> > On 12/14/21 01:41, Qu Wenruo wrote:
> >> 
> >> 
> >> On 2021/12/14 00:49, Marc Zyngier wrote:
> >>> On Mon, 13 Dec 2021 16:06:14 +0000,
> >>> Peter Maydell <peter.maydell@xxxxxxxxxx> wrote:
> >>>> 
> >>>> KVM on big.little setups is a kernel-level question really; I've
> >>>> cc'd the kvmarm list.
> >>> 
> >>> Thanks Peter for throwing us under the big-little bus! ;-)
> >>> 
> >>>> 
> >>>> On Mon, 13 Dec 2021 at 15:02, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> >>>>> 
> >>>>> 
> >>>>> 
> >>>>> On 2021/12/13 21:17, Michal Prívozník wrote:
> >>>>>> On 12/11/21 02:58, Qu Wenruo wrote:
> >>>>>>> Hi,
> >>>>>>> 
> >>>>>>> Recently I got my libvirt setup on both RK3399 (RockPro64) and
> >>>>>>> RPI CM4, with upstream kernels.
> >>>>>>> 
> >>>>>>> For RPI CM4 it's mostly smooth sailing, but on RK3399 due to its
> >>>>>>> little.BIG setup (core 0-3 are 4x A55 cores, and core 4-5 are 2x
> >>>>>>> A72 cores), it brings quite some troubles for VMs.
> >>>>>>> 
> >>>>>>> In short, without proper cpuset to bind the VM to either all A72
> >>>>>>> cores or all A55 cores, the VM will mostly fail to boot.
> >>> 
> >>> s/A55/A53/. There were thankfully no A72+A55 ever produced (just the
> >>> thought of it makes me sick).
> >>> 
> >>>>>>> 
> >>>>>>> Currently the working xml is:
> >>>>>>> 
> >>>>>>>      <vcpu placement='static' cpuset='4-5'>2</vcpu>
> >>>>>>>      <cpu mode='host-passthrough' check='none'/>
> >>>>>>> 
> >>>>>>> But even with vcpupin, pinning each vcpu to each physical core,
> >>>>>>> the VM will mostly fail to start up because vcpu initialization
> >>>>>>> fails with -EINVAL.
> >>> 
> >>> Disclaimer: I know nothing about libvirt (and no, I don't want to
> >>> know! ;-).
> >>> 
> >>> However, for things to be reliable, you need to taskset the whole QEMU
> >>> process to the CPU type you intend to use.
> >> 
> >> Yep, that's what I'm doing.
> >> 
> >>> That's because, AFAICT,
> >>> QEMU will snapshot the system registers outside of the vcpu threads,
> >>> and attempt to use the result to configure the actual vcpu threads. If
> >>> they happen to run on different CPU types, the sysregs will differ in
> >>> incompatible ways and an error will be returned. This may or may not
> >>> be a bug, I don't know (I see it as a feature).
> >> 
> >> Then this brings another question.
> >> 
> >> If we can pin each vCPU to each physical core (both little and big),
> >> then as long as the registers are per-vCPU based, it should be able to
> >> pass both big and little cores to the VM.
> >> 
> >> Yeah, I totally understand this screws up the scheduling, but that's at
> >> least what (some insane) users want (just like me).
> >> 
> >>> 
> >>> If you are annoyed with this behaviour, you can always use a different
> >>> VMM that won't care about such difference (crosvm or kvmtool, to name
> >>> a few).
> >> 
> >> Sounds pretty interesting, a new world but without libvirt...
> >> 
> >>> However, the guest will be able to observe the migration from
> >>> one cpu type to another. This may or may not affect your guest's
> >>> behaviour.
> >> 
> >> Not sure if it's possible to pin each vCPU thread to each core, but let
> >> me try.
> >> 
> > 
> > Sure it is, for instance:
> > 
> > <cputune>
> >      <vcpupin vcpu="0" cpuset="1-4,^2"/>
> >      <vcpupin vcpu="1" cpuset="0,1"/>
> >      <vcpupin vcpu="2" cpuset="2,3"/>
> >      <vcpupin vcpu="3" cpuset="0,4"/>
> >      <emulatorpin cpuset="1-3"/>
> >      <iothreadpin iothread="1" cpuset="5,6"/>
> >      <iothreadpin iothread="2" cpuset="7,8"/>
> > </cputune>
> 
> That's what I have already tried before.
> I pinned vcpu 0-6 to physical core 0-6, and still no reliable boot up.
> 
> And that's why I'm asking here.

You are still missing the point of how QEMU works:

- QEMU creates a dummy VM with a single vcpu. This can happen on *any*
  CPU.
- It snapshots the sysregs for this vcpu, and keeps them for later
- It then destroys this VM
- QEMU then creates the full VM, with all the vcpus
- Each vcpu gets initialised with the state saved earlier. If any vcpu
  is initialised on a physical CPU of a different type from the one
  that has been used for the dummy VM, you lose, as we cannot restore
  some of the registers such as MIDR_EL1 (and other registers that KVM
  considers as invariant).
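
The sequence above can be sketched as a toy Python model. This is not
QEMU or KVM code: the names CPU_PART, snapshot_template, init_vcpu and
start_guest are invented for illustration, and only the part number
field of MIDR_EL1 is modelled. The CPU layout is the RK3399 one from
this thread (cores 0-3 Cortex-A53, part 0xd03; cores 4-5 Cortex-A72,
part 0xd08).

```python
import errno

# Assumed RK3399 layout: physical CPU number -> MIDR_EL1 part number.
CPU_PART = {0: 0xD03, 1: 0xD03, 2: 0xD03, 3: 0xD03, 4: 0xD08, 5: 0xD08}


def snapshot_template(cpu):
    """Model the dummy VM: record the invariant value seen on `cpu`."""
    return {"midr_part": CPU_PART[cpu]}


def init_vcpu(template, cpu):
    """Model vcpu init: restoring an invariant register only succeeds
    when the saved value matches what the current physical CPU reports."""
    if template["midr_part"] != CPU_PART[cpu]:
        return -errno.EINVAL
    return 0


def start_guest(template_cpu, vcpu_cpus):
    """Snapshot once on `template_cpu`, then init one vcpu per entry of
    `vcpu_cpus` (the physical CPU each vcpu thread happens to run on)."""
    template = snapshot_template(template_cpu)
    return [init_vcpu(template, cpu) for cpu in vcpu_cpus]
```

In this model, pinning only the vcpu threads does not help: if the
dummy VM happened to run on an A53 while the vcpus are pinned to the
A72s, every init_vcpu() call returns -EINVAL.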

To fix this, you need to change QEMU's notion of a template VM, or
change KVM's notion of invariant registers. The former is quite hard,
and the latter breaks a ton of things for guests, such as errata
workarounds.

The best workaround is to taskset the QEMU process (and I really mean
the process, not individual threads) to a homogeneous set of CPUs and
be done with it.
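
A minimal sketch of that workaround in Python, assuming an arm64 Linux
host whose /proc/cpuinfo exposes "CPU implementer" and "CPU part" lines
for each processor. The helper names group_cpus_by_type and run_pinned
are hypothetical, and the command passed to run_pinned is whatever
QEMU invocation you already use.

```python
import os
from collections import defaultdict


def group_cpus_by_type(cpuinfo_text):
    """Group CPU numbers by (implementer, part) as listed in the text of
    /proc/cpuinfo. Blocks without a "processor" line are ignored."""
    groups = defaultdict(set)
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields and "CPU part" in fields:
            cpu_type = (fields.get("CPU implementer"), fields["CPU part"])
            groups[cpu_type].add(int(fields["processor"]))
    return dict(groups)


def run_pinned(cpus, argv):
    """Restrict this (single-threaded) process to `cpus`, then exec
    `argv`. The affinity mask is preserved across exec and inherited by
    every thread the new program creates later."""
    os.sched_setaffinity(0, cpus)
    os.execvp(argv[0], argv)
```

Reading /proc/cpuinfo, picking one of the returned groups (for example
the one keyed by part 0xd08 on RK3399) and passing it to run_pinned()
has the same effect as running "taskset -c 4-5" in front of the QEMU
command line.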

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



