On Thu, 2022-03-17 at 10:05 +0100, Igor Mammedov wrote: > re-sending reply as something went wrong with headers (I suspect Daniel's name formatting) > and email got bounced back. > > On Wed, 16 Mar 2022 14:31:33 +0000 > David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote: > > > On Wed, 2022-03-16 at 12:28 +0100, Igor Mammedov wrote: > > > Generally Daniel is right, as long as it's something that what real hardware > > > supports. (usually it's job if upper layers which know what guest OS is used, > > > and can tweak config based on that knowledge). > > > > > > But it's virt only extension and none (tested with > > > Windows (hangs on boot), > > > Linux (brings up only first 255 cpus) > > > ) of mainline OSes ended up up working as expected (i.e. user asked for this > > > many CPUs but can't really use them as expected). > > > > As I said, that kind of failure mode will happen even with the split > > irq chip and EXT_DEST_ID, with Windows and older (pre-5.10) Linux > > kernels. > > > > For older guests it would also happen on real hardware, and in VMs > > where you expose an IOMMU with interrupt remapping. Some guests don't > > support interrupt remapping, or don't support X2APIC at all. > > > > > Which would just lead to users reporting (obscure) bugs. > > > > It's not virt only. This could happen with real hardware. > > I was talking about EXT_DEST_ID kvm extension. Then I'm confused, because that isn't the conversation we were having before. And in that case what you say above about Linux only bringing up the first 255 CPUs directly contradicts what you say below, my own experience, and the whole *point* of the EXT_DEST_ID extension :) Let's start again. You observed that Linux guests fail to bring up >254 vcPUs if qemu doesn't enable the EXT_DEST_ID support, and your fix was to require EXT_DEST_ID (as a side-effect of requiring split irqchip). This reminded me of the fixes I'd been posting since 2020 which still haven't been merged, so I dusted those off and resent them. I didn't incorporate your change, and objected to your patch because I think it's pointless babysitting. Yes, in the general case if you want your guest to use more than 254 vCPUs you need to take a moment to think about precisely what your guest operating system requires in order to support that. At the very least it needs X2APIC support, and then you need *one* of: • EXT_DEST_ID, • Interrupt remapping, or • just using those vCPUs without external interrupts. Both of the first two require the split irqchip, so your patch just doesn't let users rely on that last option. I conceded (cited below) because I don't know of any existing guest OS which does use that last option. I'd attempted to make Linux do so, but eventually abandoned it: https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/irqaffinity But now you seem to be making a different argument? > > Anyway, as far as I'm concerned it doesn't matter very much whether we > > insist on the split irq chip or not. Feel free to repost your patch > > rebased on top of my fixes, which are also in my tree at > > https://git.infradead.org/users/dwmw2/qemu.git > > > > > > The check you're modifying has moved to x86_cpus_init(). > > if we are to keep iommu dependency then moving to x86_cpus_init() > isn't an option, it should be done at pc_machine_done() time. Thus far, I didn't think anyone had been talking about a dependency on IOMMU. That doesn't make any sense at all. EXT_DEST_ID is perfectly sufficient for Linux kernels from 5.10 onwards and they don't need the IOMMU. So no. Post your patch to s/kvm_irqchip_in_kernel/kvm_irqchip_is_split/ in x86_cpus_init() by all means; I don't care much about that and I've even incorporated that change into my tree at https://git.infradead.org/users/dwmw2/qemu.git so that if I do have to repost these fixes yet again, it'll be included. But let's not re-add the IOMMU dependency. That would be wrong.
Attachment:
smime.p7s
Description: S/MIME cryptographic signature