On Wed, Sep 11, 2024 at 02:53:31PM +1000, Alistair Popple wrote:
>
> Matthew Brost <matthew.brost@xxxxxxxxx> writes:
>
> > Avoid multiple CPU page faults to the same device page racing by locking
> > the page in do_swap_page before taking an additional reference to the
> > page. This prevents scenarios where multiple CPU page faults each take
> > an extra reference to a device page, which could abort migration in
> > folio_migrate_mapping. With the device page locked in do_swap_page, the
> > migrate_vma_* functions need to be updated to avoid locking the
> > fault_page argument.
>
> I added the get_page() and therefore the fault_page argument to deal
> with another lifetime issue. Neither Felix nor I particularly liked the
> solution though (see
> https://lore.kernel.org/all/cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@xxxxxxxxxx/T/#m715589d57716448386ff9af41da27a952d94615a)
> and this seems to make it even uglier, so I have suggested an
> alternative solution below.
>

Thanks for the ref, I will read it through.

> > Prior to this change, a livelock scenario could occur in Xe's (Intel GPU
> > DRM driver) SVM implementation if enough threads faulted the same device
> > page.
>
> I haven't seen the same in the NVIDIA UVM driver (out-of-tree, I know)

Still a driver.

> but theoretically it seems like it should be possible. However we
> serialize migrations of the same virtual address range to avoid these
> kinds of issues, as they can happen the other way too (i.e. multiple
> threads trying to migrate to the GPU).
>
> So I suspect what happens in UVM is that one thread wins and installs
> the migration entry while the others fail to get the driver migration
> lock and bail out sufficiently early in the fault path to avoid the
> livelock.
>

I had to try hard to show this; I doubt an actual user could trigger
it. I wrote a test which kicks off 8 threads that each try to read the
same page, joins them, and repeats this in a loop for 512 pages or so
(rough shape sketched at the end of this mail).

I needed an exclusive lock in the migrate_to_ram vfunc for it to
livelock. Without an exclusive lock I saw, on average, about 32k
retries (i.e. migrate_to_ram calls on the same page) before a thread
won the race.
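
To make the commit message above concrete, the shape of the
do_swap_page() change is roughly the following. This is a sketch of
the device-private path, not the literal diff, and the VM_FAULT_RETRY
bookkeeping is elided:

	} else if (is_device_private_entry(entry)) {
		vmf->page = pfn_swap_entry_to_page(entry);
		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
					       vmf->address, &vmf->ptl);
		/* ... pte checks elided ... */
		if (trylock_page(vmf->page)) {
			/*
			 * Take the extra reference only with the page
			 * locked, so concurrent faults on the same page
			 * cannot each pin it and abort the migration in
			 * folio_migrate_mapping().
			 */
			get_page(vmf->page);
			pte_unmap_unlock(vmf->pte, vmf->ptl);
			ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
			unlock_page(vmf->page);
			put_page(vmf->page);
		} else {
			/*
			 * Another fault holds the lock and is driving
			 * the migration; back off and retry the fault.
			 */
			pte_unmap_unlock(vmf->pte, vmf->ptl);
			ret = VM_FAULT_RETRY;
		}
	}

With the lock held across migrate_to_ram, the migrate_vma_* functions
then have to skip locking fault_page, as the commit message says.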
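
The early bail-out Alistair describes would, I imagine, look something
like this on the driver side. Purely hypothetical sketch with made-up
names, not actual UVM code:

	static vm_fault_t range_migrate_to_ram(struct vm_fault *vmf)
	{
		struct va_range *range = va_range_lookup(vmf->vma,
							 vmf->address);

		/* Loser of the race bails out instead of pinning the page. */
		if (!mutex_trylock(&range->migrate_lock))
			return VM_FAULT_RETRY;

		/* ... set up and run migrate_vma_* on the range ... */

		mutex_unlock(&range->migrate_lock);
		return 0;
	}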
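
FWIW, the reproducer test was roughly this shape. Userspace sketch
only: in the real test the buffer is first migrated to device memory
through the driver, so each read below takes a device-page fault and
ends up in migrate_to_ram(); plain anonymous memory stands in here.

	#include <pthread.h>
	#include <sys/mman.h>
	#include <unistd.h>

	#define NTHREADS 8
	#define NPAGES   512

	static volatile char *page;	/* page under contention */
	static volatile char sink;

	static void *reader(void *arg)
	{
		sink = page[0];	/* concurrent CPU fault on the same page */
		return NULL;
	}

	int main(void)
	{
		long pagesz = sysconf(_SC_PAGESIZE);
		char *buf = mmap(NULL, NPAGES * pagesz,
				 PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		pthread_t threads[NTHREADS];

		if (buf == MAP_FAILED)
			return 1;

		/* Real test: migrate buf to device memory here. */

		for (int i = 0; i < NPAGES; i++) {
			page = buf + i * pagesz;
			for (int t = 0; t < NTHREADS; t++)
				pthread_create(&threads[t], NULL,
					       reader, NULL);
			for (int t = 0; t < NTHREADS; t++)
				pthread_join(threads[t], NULL);
		}
		return 0;
	}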