On Tue, Nov 05, 2024 at 11:22:12AM +0100, Thomas Hellström wrote: > On Mon, 2024-11-04 at 15:07 -0800, Matthew Brost wrote: > > > We > > > have > > > https://elixir.bootlin.com/linux/v6.12-rc6/source/include/linux/int > > > erval_tree_generic.h#L24 > > > > > > to relate to. Now GPUVM can't use the generic version since it > > > needs > > > u64 intervals. These trees need unsigned long only so it should be > > > ok. > > > And safe removal, isn't that possible to implement without the > > > list? > > > Then it's really only the linked list as a perf optimization I > > > guess, > > > but we have a lot of those pending... > > > > > > > See my other comments. Let me just follow on using a maple tree and > > perhaps a > > list isn't required if we use that. Will have definite answer in my > > next rev. > > Note, though, that IIRC maple trees do not allow overlapping ranges, > and If we need to support multiple svm VMAs with different offsets, > like Christian suggests, we will likely have overlapping ranges for the > range tree but not for the notifier tree. > I don't think that is how overlapping ranges would look though. We'd have multiple GPU VMAs / GPU ptes to pointing the same SVM range. The SVM ranges speak in the CPU address space - we'd attach multiple GPU VMAs to the SVM so in the notifier we can find all the GPU pages to invalidate. At least I think it would look this way - can cross that bridge if / when we get to it though. > Thinking a bit more about this, my concern is mostly around needlessly > instantiating new interval trees instead of using the generic > instantiation, because that is clearly against recommended practice. > Ok, so with this statement then I think both the interval trees in GPU VM / xe_range_fence are going again the recommended practice too? > But the list could probably be added anyway if needed, and it does > indeed AFAICT reduce the traversal complexity from O(N ln N) to O(N). > I think this will show in the notifier. We currently walk the notifier's range tree twice - once to do the invalidate, once to unmap the pages / add to the garbage collector. I even optimize this so the 2nd walk doesn't have to lookup first range again making the complexity O(ln N + 2 * N) vs (2 * N * ln N) without a list. Matt > /Thomas >