On Fri, Oct 21, 2022 at 02:08:02PM +0200, Niklas Schnelle wrote:
> On Thu, 2022-10-20 at 08:05 -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 20, 2022 at 10:51:10AM +0200, Niklas Schnelle wrote:
> >
> > > Ok that makes sense thanks for the explanation. So yes my assessment is
> > > still that in this situation the IOTLB flush is architected to return
> > > an error that we can ignore. Not the most elegant I admit but at least
> > > it's simple. Alternatively I guess we could use call_rcu() to do the
> > > zpci_unregister_ioat() but I'm not sure how to then make sure that a
> > > subsequent zpci_register_ioat() only happens after that without adding
> > > too much more logic.
> >
> > This won't work either as the domain could have been freed before the
> > call_rcu() happens, the domain needs to be detached synchronously
> >
> > Jason
>
> Yeah right, that is basically the same issue I was thinking of for a
> subsequent zpci_register_ioat(). What about the obvious one. Just call
> synchronize_rcu() before zpci_unregister_ioat()?

Ah, it can be done, but be prepared to wait >> 1s for synchronize_rcu()
to complete in some cases.

What you have seems like it could be OK, just deal with the ugly racy
failure.

Jason