On Thu, Jul 21, 2022 at 07:00:23PM -0400, Felix Kuehling wrote:
> Hi all,
>
> We're noticing some unexpected behaviour when the amdgpu and Mellanox
> drivers are interacting on shared memory with hmm_range_fault. If the amdgpu
> driver migrated pages to DEVICE_PRIVATE memory, we would expect
> hmm_range_fault called by the Mellanox driver to fault them back to system
> memory. But that's not happening. Instead hmm_range_fault fails.
>
> For an experiment, Philip hacked hmm_vma_handle_pte to treat DEVICE_PRIVATE
> pages like device_exclusive pages, which gave us the expected behaviour. It
> would result in a dev_pagemap_ops.migrate_to_ram callback in our driver, and
> hmm_range_fault would return system memory pages to the Mellanox driver.
>
> So something is clearly wrong. It could be:
>
> * our expectations are wrong,
> * the implementation of hmm_range_fault is wrong, or
> * our driver is missing something when migrating to DEVICE_PRIVATE memory.
>
> Do you have any insights?

I think it is a bug.

Jason
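
(For context, a minimal sketch of the !pte_present() swap-entry path in
hmm_vma_handle_pte() that the experiment above touches. This is loosely
based on mm/hmm.c around v5.19, simplified rather than copied, and the
comment marks where treating non-owned DEVICE_PRIVATE entries like
device_exclusive ones would change the outcome; it is not the actual
patch.)

	if (!pte_present(pte)) {
		swp_entry_t entry = pte_to_swp_entry(pte);

		/*
		 * DEVICE_PRIVATE pages owned by the caller are reported
		 * as-is; the owning driver understands its own device memory.
		 */
		if (is_device_private_entry(entry) &&
		    pfn_swap_entry_to_page(entry)->pgmap->owner ==
			    range->dev_private_owner) {
			cpu_flags = HMM_PFN_VALID;
			if (is_writable_device_private_entry(entry))
				cpu_flags |= HMM_PFN_WRITE;
			*hmm_pfn = swp_offset(entry) | cpu_flags;
			return 0;
		}

		required_fault = hmm_pte_need_fault(hmm_vma_walk,
						    pfn_req_flags, 0);
		if (!required_fault) {
			*hmm_pfn = 0;
			return 0;
		}

		if (!non_swap_entry(entry))
			goto fault;

		/*
		 * device_exclusive entries take the fault path, which
		 * restores a normal system-memory mapping. The experiment
		 * described above also sends non-owned DEVICE_PRIVATE
		 * entries down this path, so the fault ends up calling the
		 * owning driver's dev_pagemap_ops.migrate_to_ram instead of
		 * falling through to the -EFAULT at the bottom.
		 */
		if (is_device_exclusive_entry(entry))
			goto fault;

		if (is_migration_entry(entry)) {
			pte_unmap(ptep);
			hmm_vma_walk->last = addr;
			migration_entry_wait(walk->mm, pmdp, addr);
			return -EBUSY;
		}

		/* Everything else, including a DEVICE_PRIVATE entry not
		 * owned by the caller, fails here. */
		pte_unmap(ptep);
		return -EFAULT;
	}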