On Mon, 2023-03-27 at 19:50 -0700, Sathyanarayanan Kuppuswamy wrote: > Hi Kai, > > On 3/27/23 7:38 PM, Huang, Kai wrote: > > > +/* Reserve an IRQ from x86_vector_domain for TD event notification */ > > > +static int __init tdx_event_irq_init(void) > > > +{ > > > + struct irq_alloc_info info; > > > + cpumask_t saved_cpumask; > > > + struct irq_cfg *cfg; > > > + int cpu, irq; > > > + > > > + if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) > > > + return 0; > > > + > > > + init_irq_alloc_info(&info, NULL); > > > + > > > + /* > > > + * Event notification vector will be delivered to the CPU > > > + * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested. > > > + * So set the IRQ affinity to the current CPU. > > > + */ > > > + cpu = get_cpu(); > > > + cpumask_copy(&saved_cpumask, current->cpus_ptr); > > > + info.mask = cpumask_of(cpu); > > > + put_cpu(); > > The 'saved_cpumask' related code is ugly. If you move put_cpu() to the end of > > this function, I think you can remove all related code: > > > > cpu = get_cpu(); > > > > /* > > * Set @info->mask to local cpu to make sure a valid vector is > > * pre-allocated when TDX event notification IRQ is allocated > > * from x86_vector_domain. > > */ > > init_irq_alloc_info(&info, cpumask_of(cpu)); > > > > // rest staff: request_irq(), hypercall ... > > > > put_cpu(); > > > > init_irq_alloc_info() is a sleeping function. Since get_cpu() disables > preemption, we cannot call sleeping function after it. Initially, I > have implemented it like you have mentioned. However, I discovered the > following error. Oh sorry I forgot this. So I think we should use migrate_disable() instead: migrate_disable(); init_irq_alloc_info(&info, cpumask_of(smp_processor_id())); ... migrate_enable(); Or, should we just use early_initcall() so that only BSP is running? IMHO it's OK to always allocate the vector from BSP. Anyway, either way is fine to me. > > [ 2.400755] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 > [ 2.404664] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 > [ 2.408671] preempt_count: 1, expected: 0 > [ 2.412650] RCU nest depth: 0, expected: 0 > [ 2.412666] no locks held by swapper/0/1. > [ 2.416650] Preemption disabled at: > [ 2.416650] [<ffffffff83b8089f>] tdx_arch_init+0x38/0x117 > [ 2.420670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc4-00117-g672ca073d9f9-dirty #2527 > [ 2.424650] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 > [ 2.424650] Call Trace: > [ 2.424650] <TASK> > [ 2.424650] dump_stack_lvl+0x6a/0x86 > [ 2.424650] __might_resched.cold+0xf4/0x12f > [ 2.424650] __mutex_lock+0x50/0x810 > [ 2.424650] ? lock_is_held_type+0xd8/0x130 > [ 2.424650] ? __irq_alloc_descs+0xcf/0x310 > [ 2.424650] ? find_held_lock+0x2b/0x80 > [ 2.424650] ? __irq_alloc_descs+0xcf/0x310 > [ 2.424650] __irq_alloc_descs+0xcf/0x310 > [ 2.424650] irq_domain_alloc_descs.part.0+0x49/0xa0 > [ 2.424650] __irq_domain_alloc_irqs+0x2a0/0x4f0 > [ 2.424650] ? next_arg+0x129/0x1f0 > [ 2.424650] ? tdx_guest_init+0x5b/0x5b > [ 2.424650] tdx_arch_init+0x8e/0x117 > [ 2.424650] do_one_initcall+0x137/0x2ec > [ 2.424650] ? rcu_read_lock_sched_held+0x36/0x60 > [ 2.424650] kernel_init_freeable+0x1e3/0x241 > [ 2.424650] ? rest_init+0x1a0/0x1a0 > [ 2.424650] kernel_init+0x17/0x170 > [ 2.424650] ? rest_init+0x1a0/0x1a0 > [ 2.424650] ret_from_fork+0x1f/0x30 > [ 2.424650] </TASK>