Re: spinlock recursion in aio_complete()

On 5/22/23 23:22, Helge Deller wrote:
> It hangs in fs/aio.c:1128, function aio_complete(), in this call:
>      spin_lock_irqsave(&ctx->completion_lock, flags);
>
> All code that I found and that obtains ctx->completion_lock disables IRQs.
> It is not clear to me how this spinlock can be locked recursively? Is it
> sure that the "spinlock recursion" report is correct?
>
> Yes, it seems correct.
> [...]

Bart, thanks to your suggestions I was able to narrow down the problem!

I got LOCKDEP working on parisc, which then reports:
	raw_local_irq_restore() called with IRQs enabled
for the spin_unlock_irqrestore() in function aio_complete(), which shouldn't happen.

Finally, I found that parisc's flush_dcache_page() re-enables the IRQs
which leads to the spinlock hang in aio_complete().

So, this is NOT a bug in aio or scsi, but we need a fix in the arch code.


While checking why flush_dcache_page() re-enables IRQs, I see that on parisc and ARM(32)
flush_dcache_page() calls:
  -> flush_dcache_mmap_lock()  /  flush_dcache_mmap_unlock()
which uses: xa_lock_irq()  /  xa_unlock_irq()

So, the call to xa_unlock_irq() re-enables the IRQs unconditionally
and triggers the hang in aio_complete().
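
To illustrate the nesting problem described above, here is a minimal user-space
sketch. It is not kernel code: a single bool stands in for the CPU's IRQ-enable
state, and every model_*() and inner_*() name is hypothetical and exists only for
this illustration. inner_saving() shows what a save/restore style inner section
would do in this model; it is not meant as the actual fix.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only: one flag stands in for the CPU's IRQ-enable state. */
static bool irqs_enabled = true;

/* Models the *_irqsave() side: remember the current state, then disable IRQs. */
static bool model_irq_save(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models the *_irqrestore() side: put back whatever state was remembered. */
static void model_irq_restore(bool flags)
{
	irqs_enabled = flags;
}

/* Models a *_lock_irq() / *_unlock_irq() pair: no state is remembered. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

static void model_unlock_irq(void)
{
	irqs_enabled = true;	/* unconditional, whatever the caller had */
}

/* Inner section in the style of xa_lock_irq() / xa_unlock_irq(). */
static void inner_unconditional(void)
{
	model_lock_irq();
	model_unlock_irq();
}

/* Hypothetical inner section that saves and restores the state instead. */
static void inner_saving(void)
{
	bool flags = model_irq_save();

	model_irq_restore(flags);
}

/*
 * Outer section in the style of aio_complete(): returns the IRQ state seen
 * right after the inner call, while IRQs are still supposed to be off.
 */
static bool irqs_on_inside_outer_section(void (*inner)(void))
{
	bool flags, leaked;

	irqs_enabled = true;
	flags = model_irq_save();	/* like spin_lock_irqsave() */
	assert(!irqs_enabled);		/* IRQs are off after the save */
	inner();			/* like the flush_dcache_page() call */
	leaked = irqs_enabled;
	model_irq_restore(flags);	/* like spin_unlock_irqrestore() */
	return leaked;
}
```

In this model, irqs_on_inside_outer_section(inner_unconditional) returns true
(IRQs got switched back on inside the outer section), while
irqs_on_inside_outer_section(inner_saving) returns false.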

I temporarily #defined flush_dcache_mmap_lock() to a NOP and the kernel booted nicely.

Not sure yet what the best fix is...

Helge



