On Fri, May 03, 2024 at 12:44:09PM -0700, Tomasz Jeznach wrote: > > For detach I think yes: > > > > Inv CPU Detach CPU > > > > write io_pte Update device descriptor > > rcu_read_lock > > list_for_each > > <make invalidation command> <make description inv cmd> > > dma_wmb() dma_wmb() > > <doorbell> <cmd doorbell> > > rcu_read_unlock > > list_del_rcu() > > <wipe ASID> > > > > In this case I think we never miss an invalidation, the list_del is > > always after the HW has been fully fenced, so I don't think we can > > have any issue. Maybe a suprious invalidation if the ASID gets > > re-used, but who cares. > > > > Attach is different.. > > > > Inv CPU Attach CPU > > > > write io_pte > > rcu_read_lock > > list_for_each // empty > > list_add_rcu() > > Update device descriptor > > <make description inv cmd> > > dma_wmb() > > <cmd doorbell> > > rcu_read_unlock > > > > As above shows we can "miss" an invalidation. The issue is narrow, the > > io_pte could still be sitting in write buffers in "Inv CPU" and not > > yet globally visiable. "Attach CPU" could get the device descriptor > > installed in the IOMMU and the IOMMU could walk an io_pte that is in > > the old state. Effectively this is because there is no release/acquire > > barrier passing the io_pte store from the Inv CPU to the Attach CPU to the > > IOMMU. > > > > It seems like it should be solvable somehow: > > 1) Inv CPU releases all the io ptes > > 2) Attach CPU acquires the io ptes before updating the DDT > > 3) Inv CPU acquires the RCU list in such a way that either attach > > CPU will acquire the io_pte or inv CPU will acquire the RCU list. > > 4) Either invalidation works or we release the new iopte to the SMMU > > and don't need it. > > > > But #3 is a really weird statement. smb_mb() on both sides may do the > > job?? > > > > Actual attach sequence is slightly different. > > Inv CPU Attach CPU > > write io_pte > rcu_read_lock > list_for_each // empty > list_add_rcu() > IOTLB.INVAL(PSCID) > <make description inv cmd> > dma_wmb() > <cmd doorbell> > rcu_read_unlock > > I've tried to cover this case with riscv_iommu_iotlb_inval() called > before the attached domain is visible to the device. That invalidation shouldn't do anything. If this is the first attach of a PSCID then the PSCID had better already be empty, it won't become non-empty until the DDT entry is installed. And if it is the second attach then the Inv CPU is already taking care of things, no need to invalidate at all. Regardless, there is still a theortical race that the IOPTEs haven't been made visible yet because there is still no synchronization with the CPU writing them. So, I don't think this solves any problem. I belive you need the appropriate kind of CPU barrier here instead of an invalidation. Jason