On Tue, 2023-04-04 at 18:02 -0700, Sathyanarayanan Kuppuswamy wrote:
>
> On 3/27/23 9:02 PM, Huang, Kai wrote:
> > On Mon, 2023-03-27 at 19:50 -0700, Sathyanarayanan Kuppuswamy wrote:
> > > Hi Kai,
> > >
> > > On 3/27/23 7:38 PM, Huang, Kai wrote:
> > > > > +/* Reserve an IRQ from x86_vector_domain for TD event notification */
> > > > > +static int __init tdx_event_irq_init(void)
> > > > > +{
> > > > > +        struct irq_alloc_info info;
> > > > > +        cpumask_t saved_cpumask;
> > > > > +        struct irq_cfg *cfg;
> > > > > +        int cpu, irq;
> > > > > +
> > > > > +        if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> > > > > +                return 0;
> > > > > +
> > > > > +        init_irq_alloc_info(&info, NULL);
> > > > > +
> > > > > +        /*
> > > > > +         * Event notification vector will be delivered to the CPU
> > > > > +         * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
> > > > > +         * So set the IRQ affinity to the current CPU.
> > > > > +         */
> > > > > +        cpu = get_cpu();
> > > > > +        cpumask_copy(&saved_cpumask, current->cpus_ptr);
> > > > > +        info.mask = cpumask_of(cpu);
> > > > > +        put_cpu();
> > > >
> > > > The 'saved_cpumask' related code is ugly. If you move put_cpu() to the end of
> > > > this function, I think you can remove all related code:
> > > >
> > > >         cpu = get_cpu();
> > > >
> > > >         /*
> > > >          * Set @info->mask to local cpu to make sure a valid vector is
> > > >          * pre-allocated when TDX event notification IRQ is allocated
> > > >          * from x86_vector_domain.
> > > >          */
> > > >         init_irq_alloc_info(&info, cpumask_of(cpu));
> > > >
> > > >         // rest stuff: request_irq(), hypercall ...
> > > >
> > > >         put_cpu();
> > > >
> > > init_irq_alloc_info() is a sleeping function. Since get_cpu() disables
> > > preemption, we cannot call sleeping function after it. Initially, I
> > > have implemented it like you have mentioned. However, I discovered the
> > > following error.
> >
> > Oh sorry I forgot this. So I think we should use migrate_disable() instead:
> >
> >         migrate_disable();
> >
> >         init_irq_alloc_info(&info, cpumask_of(smp_processor_id()));
> >
> >         ...
> >
> >         migrate_enable();
> >
> > Or, should we just use early_initcall() so that only BSP is running? IMHO it's
> > OK to always allocate the vector from BSP.
> >
> > Anyway, either way is fine to me.
>
> Final version looks like below.
>
> static int __init tdx_event_irq_init(void)
> {
>         struct irq_alloc_info info;
>         struct irq_cfg *cfg;
>         int irq;
>
>         if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>                 return 0;
>
>         init_irq_alloc_info(&info, NULL);
>
>         /*
>          * Event notification vector will be delivered to the CPU
>          * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
>          * So set the IRQ affinity to the current CPU.
>          */
>         info.mask = cpumask_of(0);
>
>         irq = irq_domain_alloc_irqs(x86_vector_domain, 1, cpu_to_node(0), &info);
>         if (irq <= 0) {
>                 pr_err("Event notification IRQ allocation failed %d\n", irq);
>                 return -EIO;
>         }
>
>         irq_set_handler(irq, handle_edge_irq);
>
>         /* Since the IRQ affinity is set, it cannot be balanced */
>         if (request_irq(irq, tdx_event_irq_handler, IRQF_NOBALANCING,
>                         "tdx_event_irq", NULL)) {
>                 pr_err("Event notification IRQ request failed\n");
>                 goto err_free_domain_irqs;
>         }
>
>         cfg = irq_cfg(irq);
>
>         /*
>          * Since tdx_event_irq_init() is triggered via early_initcall(),
>          * it will be called before the secondary CPUs are brought up.
>          * Since there is only one CPU, it complies with the requirement
>          * of executing the TDVMCALL_SETUP_NOTIFY_INTR hypercall on the
>          * same CPU where the IRQ vector is allocated.
>          *
>          * Register callback vector address with VMM. More details
>          * about the ABI can be found in TDX Guest-Host-Communication
>          * Interface (GHCI), sec titled
>          * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
>          */
>         if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, cfg->vector, 0, 0, 0)) {
>                 pr_err("Event notification hypercall failed\n");
>                 goto err_free_irqs;
>         }
>
>         tdx_event_irq = irq;
>
>         return 0;
>
> err_free_irqs:
>         free_irq(irq, NULL);
> err_free_domain_irqs:
>         irq_domain_free_irqs(irq, 1);
>
>         return -EIO;
> }
> early_initcall(tdx_event_irq_init)

I found there's another series also doing a similar thing, and it seems
Thomas wasn't happy about using x86_vector_domain directly:

https://lore.kernel.org/lkml/877cv99k0y.ffs@tglx/

An alternative was also posted (creating an IRQ domain on top of
x86_vector_domain):

https://lore.kernel.org/lkml/20230328182933.GA1403032@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

I think we should monitor that thread and hear more from others.
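
In case it helps the discussion, "creating an IRQ domain on top of
x86_vector_domain" would roughly mean stacking a small hierarchical child
domain above the vector domain and allocating the event IRQ from that child,
instead of calling into x86_vector_domain directly.  Below is only my rough
sketch of that shape; the names tdx_event_domain, tdx_event_domain_alloc()
and "TDX-EVENT" are made up here, and the actual patch in the second link
may structure this differently:

#include <linux/irq.h>
#include <linux/irqdomain.h>
#include <asm/irqdomain.h>
#include <asm/hw_irq.h>

static struct irq_domain *tdx_event_domain;

/* Forward ack/affinity to the parent (vector) domain so the APIC gets EOI'd. */
static struct irq_chip tdx_event_irq_chip = {
        .name                   = "TDX-EVENT",
        .irq_ack                = irq_chip_ack_parent,
        .irq_set_affinity       = irq_chip_set_affinity_parent,
};

static int tdx_event_domain_alloc(struct irq_domain *domain, unsigned int virq,
                                  unsigned int nr_irqs, void *arg)
{
        struct irq_alloc_info *info = arg;
        int ret;

        if (nr_irqs != 1)
                return -EINVAL;

        /* Let x86_vector_domain reserve the actual CPU vector. */
        ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
        if (ret)
                return ret;

        /* Using virq as hwirq is good enough for a single-IRQ domain. */
        irq_domain_set_info(domain, virq, virq, &tdx_event_irq_chip, NULL,
                            handle_edge_irq, NULL, "edge");
        return 0;
}

static void tdx_event_domain_free(struct irq_domain *domain, unsigned int virq,
                                  unsigned int nr_irqs)
{
        irq_domain_free_irqs_common(domain, virq, nr_irqs);
}

static const struct irq_domain_ops tdx_event_domain_ops = {
        .alloc  = tdx_event_domain_alloc,
        .free   = tdx_event_domain_free,
};

static int __init tdx_event_domain_init(void)
{
        struct fwnode_handle *fwnode;

        fwnode = irq_domain_alloc_named_fwnode("TDX-EVENT");
        if (!fwnode)
                return -ENOMEM;

        tdx_event_domain = irq_domain_create_hierarchy(x86_vector_domain, 0, 0,
                                                       fwnode,
                                                       &tdx_event_domain_ops,
                                                       NULL);
        if (!tdx_event_domain) {
                irq_domain_free_fwnode(fwnode);
                return -ENOMEM;
        }
        return 0;
}

tdx_event_irq_init() could then keep its init_irq_alloc_info()/request_irq()
flow, but allocate from the child domain rather than x86_vector_domain:

        irq = irq_domain_alloc_irqs(tdx_event_domain, 1, cpu_to_node(0), &info);

Again, just a sketch to frame what the alternative would look like, not a
claim about what the posted series actually does.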