On Mon, Apr 02, 2018 at 01:32:37PM -0600, Logan Gunthorpe wrote:
>
>
> On 02/04/18 01:16 PM, Jerome Glisse wrote:
> > There isn't good API at the moment AFAIK, closest thing would either be
> > lookup_resource() or region_intersects(), but a more appropriate one can
> > easily be added, code to walk down the tree is readily available.
> > Moreover this can be optimize like vma lookup are, even more as resource are
> > seldomly added so read side (finding a resource) can be heavily favor
> > over write side (adding|registering a new resource).
>
> So someone needs to create a highly optimized tree that registers all
> physical address on the system and maps them to devices? That seems a
> long way from being realized. I'd hardly characterize that as "easily".
> If we can pass both devices to the API I'd suspect it would be preferred
> over the complicated tree. This, of course, depends on what users of the
> API need.

This tree already exists; it is all there upstream, see kernel/resource.c.
What is missing is something that takes a single address and returns the
device struct. There are functions that take a range, region_intersects(),
or a start address, lookup_resource(). It isn't hard to imagine a helper,
using roughly the same code as region_intersects(), that returns the device
for a resource.

And yes, currently a resource does not have a pointer back to the device
that owns it, but this can be added. It wasn't needed until now.

It can later be optimized if device lookup shows up as a bottleneck in
perf profiles.

>
> > cache coherency protocol (bit further than PCIE snoop). But also the
> > other direction the CPU access to device memory can also be cache coherent,
> > which is not the case in PCIE.
>
> I was not aware that CAPI allows PCI device memory to be cache coherent.
> That sounds like it would be very tricky...

And yet CAPI, CCIX, Gen-Z, NVLink, ... are all interconnects that aim at
achieving this cache coherency between multiple devices and CPUs.

Jérôme
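
To make the address-to-device lookup discussed above a bit more concrete,
here is a rough userspace sketch in C. It is not kernel code. The struct
layouts only loosely model struct resource, the "owner" back pointer is the
hypothetical addition mentioned above (it does not exist upstream), and
resource_add_child() and resource_find_device() are made-up names for
illustration. Locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct device {
	const char *name;
};

/*
 * Simplified userspace model of a resource tree node. The "owner" field
 * is an assumption for this sketch; upstream has no such back pointer.
 */
struct resource {
	uint64_t start;
	uint64_t end;			/* inclusive */
	struct device *owner;		/* hypothetical back pointer */
	struct resource *parent, *sibling, *child;
};

/* Append res as the last child of parent. res must lie inside parent. */
void resource_add_child(struct resource *parent, struct resource *res)
{
	struct resource **p;

	assert(parent != NULL && res != NULL);
	assert(res->start >= parent->start && res->end <= parent->end);

	p = &parent->child;
	while (*p)
		p = &(*p)->sibling;
	res->parent = parent;
	res->sibling = NULL;
	*p = res;
}

/*
 * Walk down from root along the chain of resources that contain addr and
 * return the owner recorded on the innermost one that has an owner set.
 * Returns NULL if addr is outside root or no resource on the path has an
 * owner.
 */
struct device *resource_find_device(const struct resource *root, uint64_t addr)
{
	const struct resource *r = root;
	struct device *owner = NULL;

	if (!r || addr < r->start || addr > r->end)
		return NULL;

	for (;;) {
		const struct resource *c;

		if (r->owner)
			owner = r->owner;
		/* Find the child, if any, whose range contains addr. */
		for (c = r->child; c; c = c->sibling)
			if (addr >= c->start && addr <= c->end)
				break;
		if (!c)
			return owner;
		r = c;
	}
}
```

With a root covering the whole address space and one child per device
range, resource_find_device(&root, addr) returns the device whose range
contains addr, or NULL for an unclaimed hole. The walk is a linear scan of
siblings at each level, so a real helper might want something faster if
the lookup ever shows up in profiles.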