On Fri, Feb 05, 2021 at 03:42:01PM -0500, Andrey Grodzovsky wrote:
> On 2/5/21 2:45 PM, Bjorn Helgaas wrote:
> > On Fri, Feb 05, 2021 at 11:08:45AM -0500, Andrey Grodzovsky wrote:
> > >
> > > For user mappings, including MMIO mappings, we have a reliable
> > > approach where we invalidate device address space mappings for all
> > > user on first sign of device disconnect and then on all subsequent
> > > page faults from the users accessing those ranges we insert dummy
> > > zero page into their respective page tables. It's actually the
> > > kernel driver, where no page faulting can be used such as for user
> > > space, I have issues on how to protect from keep accessing those
> > > ranges which already are released by PCI subsystem and hence can be
> > > allocated to another hot plugging device.
> >
> > That doesn't sound reliable to me, but maybe I don't understand what
> > you mean by the "first sign of device disconnect."
>
> See functions drm_dev_enter, drm_dev_exit and drm_dev_unplug in drm_drv.c
>
> > At least from a PCI
> > perspective, the first sign of a surprise hot unplug is likely to be
> > an MMIO read that returns ~0.
>
> We set drm_dev_unplug in amdgpu_pci_remove and base all later checks
> with drm_dev_enter/drm_dev_exit on this

It sounds like you are talking about an orderly notified unplug rather
than a surprise hot unplug.

If it's a surprise, the code doesn't get to fence off future MMIO
access until well after the address range is already unreachable.
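
To make the ordering concrete, here is a rough userspace model of the
window I mean. It is only a sketch, not kernel code: struct fake_dev
and all of the fake_*/guarded_read names are made up for illustration,
and the "unplugged" flag merely stands in for whatever state
drm_dev_unplug sets. The one assumption it relies on is the behavior
already mentioned above, that a read from a device that has gone away
returns ~0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct fake_dev {
	bool link_up;    /* physical presence; cleared by a surprise removal */
	bool unplugged;  /* software flag; set only when the remove path runs */
	uint32_t reg;    /* a single simulated device register */
};

/* Models an MMIO read: a read from a vanished device returns all ones. */
static uint32_t fake_mmio_read(const struct fake_dev *d)
{
	return d->link_up ? d->reg : UINT32_MAX;
}

/* Models a drm_dev_enter()-style guard: it only sees the software flag. */
static bool fake_dev_enter(const struct fake_dev *d)
{
	return !d->unplugged;
}

/* Models the remove callback, which runs some time after the physical event. */
static void fake_remove(struct fake_dev *d)
{
	d->unplugged = true;
}

/*
 * Guarded register read. Returns true only if the read was issued and the
 * value does not look like the all-ones pattern. Note that ~0 can also be
 * a legitimate register value, so this check is only a hint.
 */
static bool guarded_read(const struct fake_dev *d, uint32_t *val)
{
	if (!fake_dev_enter(d))
		return false;            /* orderly path: access is fenced off */

	*val = fake_mmio_read(d);
	return *val != UINT32_MAX;       /* surprise path: first sign is ~0 */
}
```

Between the moment link_up is cleared and the moment fake_remove()
runs, fake_dev_enter() still succeeds and the read is still issued;
the only indication of trouble is the ~0 value that comes back. Once
fake_remove() has run, the guard fences off the access as intended,
but by then the address range has already been unreachable for a
while.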