On Tue, 11 Feb 2025 19:21:54 +0000,
Frank Li <Frank.Li@xxxxxxx> wrote:
>
> The follow steps trigger kernel dump warning and
> platform_device_msi_init_and_alloc_irqs() return false.
>
> 1: platform_device_msi_init_and_alloc_irqs();
> 2: platform_device_msi_free_irqs_all();
> 3: platform_device_msi_init_and_alloc_irqs();
>
> [ 76.713677] WARNING: CPU: 3 PID: 134 at kernel/irq/msi.c:1028 msi_create_device_irq_domain+0x1bc/0x22c
> [ 76.723010] Modules linked in:
> [ 76.726082] CPU: 3 UID: 0 PID: 134 Comm: kworker/3:1H Not tainted 6.13.0-rc1-00015-gd60b98003b43-dirty #57
> [ 76.735741] Hardware name: NXP i.MX95 19X19 board (DT)
> [ 76.740883] Workqueue: kpcitest pci_epf_test_cmd_handler
> [ 76.746212] pstate: a0400009 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 76.753172] pc : msi_create_device_irq_domain+0x1bc/0x22c
> [ 76.758586] lr : msi_create_device_irq_domain+0x104/0x22c
> [ 76.763988] sp : ffff800083f43be0
> [ 76.767313] x29: ffff800083f43be0 x28: 0000000000000000 x27: ffff8000827a7000
> [ 76.774466] x26: ffff00008085f400 x25: ffff00008000b180 x24: ffff000080fc6410
> [ 76.781624] x23: ffff000085704cc0 x22: ffff8000811c8828 x21: ffff000085704cc0
> [ 76.788774] x20: ffff000082814000 x19: 0000000000000000 x18: ffffffffffffffff
> [ 76.795933] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [ 76.803083] x14: 0000000000000000 x13: 0000000f00000000 x12: 0000000000000000
> [ 76.810233] x11: 0000000000000000 x10: 000000000000002d x9 : ffff800083f43ba0
> [ 76.817383] x8 : 00000000ffffffff x7 : 0000000000000019 x6 : ffff0000857e443a
> [ 76.824533] x5 : 0000000000000000 x4 : ffffffffffffffff x3 : ffff000085704ce8
> [ 76.831683] x2 : ffff000080835640 x1 : 0000000000000213 x0 : ffff0000877189c0
> [ 76.838840] Call trace:
> [ 76.841287] msi_create_device_irq_domain+0x1bc/0x22c (P)
> [ 76.846701] msi_create_device_irq_domain+0x104/0x22c (L)
> [ 76.852118] platform_device_msi_init_and_alloc_irqs+0x6c/0xb8
>
> Do below two things in
> platform_device_msi_init_and_alloc_irqs().
> - msi_create_device_irq_domain()
> - msi_domain_alloc_irqs_range()
>
> But only call msi_domain_free_irqs_all() in
> platform_device_msi_free_irqs_all(), which missed call
> msi_remove_device_irq_domain(). This cause above kernel dump when call
> platform_device_msi_init_and_alloc_irqs() again.

I don't think this commit message makes much sense, and it doesn't
explain the essential problem, which is the lack of symmetry. I'd
suggest something like:

"platform_device_msi_init_and_alloc_irqs() performs two tasks:
allocating the MSI domain for a platform device, and allocating a
number of MSIs in that domain.

platform_device_msi_free_irqs_all() only frees the MSIs, and leaves
the MSI domain alive.

Given that platform_device_msi_init_and_alloc_irqs() is the sole tool
a platform device has to allocate platform MSIs, it would make sense
for platform_device_msi_free_irqs_all() to tear down the MSI domain at
the same time as the MSIs.

This also avoids warnings and unexpected behaviours when a driver
repeatedly allocates and frees MSIs."

With that:

Acked-by: Marc Zyngier <maz@xxxxxxxxxx>

	M.

-- 
Without deviation from the norm, progress is not possible.
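
For illustration only, the symmetry argument can be modelled in plain
userspace C. This is a rough sketch, not kernel code: struct toy_dev
and every toy_* function are hypothetical names invented here, and the
"domain" is reduced to a single flag. The only behaviour it assumes
from the report is that creating the domain a second time fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct toy_dev {
	bool has_domain;	/* stands in for the per-device MSI domain */
	unsigned int nvec;	/* stands in for the allocated MSIs */
};

/* Models init-and-alloc: create the domain, then allocate MSIs in it. */
int toy_init_and_alloc(struct toy_dev *dev, unsigned int nvec)
{
	assert(dev != NULL);
	if (dev->has_domain)
		return -1;	/* domain still alive: creation fails */
	dev->has_domain = true;
	dev->nvec = nvec;
	return 0;
}

/* Asymmetric free: releases the MSIs but leaves the domain alive. */
void toy_free_irqs_only(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->nvec = 0;
}

/* Symmetric free: releases the MSIs and tears down the domain too. */
void toy_free_irqs_and_domain(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->nvec = 0;
	dev->has_domain = false;
}
```

In this model, init/free/init fails on the second init when the
asymmetric free is used, and succeeds when the free also drops the
domain, which is the behaviour the suggested commit message describes.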