On Thu, Jan 30, 2025 at 05:09:44PM +0100, Simona Vetter wrote:
> > You could also use an integer instead of a pointer to indicate the
> > cluster of interconnect, I think there are many options..
>
> Hm yeah I guess an integer allocator of the atomic_inc kind plus "surely
> 32bit is enough" could work. But I don't think it's needed, if we can
> reliably just unregister the entire dev_pagemap and then just set up a
> new one. Plus that avoids thinking about which barriers we might need
> where exactly all over mm code that looks at the owner field.

IMHO that is the best answer, if it works for the driver.

> > ? It is supposed to work, it blocks until all the pages are freed, but
> > AFAIK there is no fundamental lifetime issue. The driver is
> > responsible to free all its usage.
>
> Hm I looked at it again, and I guess with the fixes to make migration
> to system memory work reliably in Matt Brost's latest series it should
> indeed work reliably. The devm_ version still freaks me out because of
> how easily people misuse these for things that are memory allocations.

I also don't like the devm stuff, especially in costly places like
this. Oh well.

> > > An optional callback (or redoing hmm_range_fault or whacking the
> > > migration helpers a few times) looks a lot less scary to me than
> > > making pgmap->owner mutable in some fashion.
> >
> > It's extra for every single 4k page on every user :\
> >
> > And what are you going to do better inside this callback?
>
> Having more comfy illusions :-P

Exactly!

> Slightly more seriously, I can grab some locks and make life easier,

Yes, but then see my concern about performance again. Now you are
locking/unlocking every 4k? And then it still races, since the owner can
change after hmm_range_fault returns. That's not small, and not really
better.

> whereas sprinkling locking or even barriers over pgmap->owner in core
> mm is not going to fly. Plus more flexibility, e.g. when the
> interconnect doesn't work for atomics, or for some other funny reason
> it only works for some of the traffic, where you need to more
> dynamically decide where memory is ok to sit.

Sure, an asymmetric mess could be problematic, and we might need more
later, but let's get to that first..

> Or checking p2pdma connectivity and all that stuff.

Should be done in the core code, we don't want drivers open coding this
stuff.

> Also note that fundamentally you can't protect against the hot unplug
> or driver unload case for hardware access. So some access will go to
> nowhere when that happens, until we've torn down all the mappings and
> migrated memory out.

I think a literal (safe!) hot unplug must always use the pagemap revoke,
and that should be safely locked between hmm_range_fault and the
notifier. If the underlying fabric is losing operations during an
unplanned hot unplug, I expect things will need resets to recover..
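To sketch what I mean by the recreate-the-pagemap approach above,
roughly this (untested, struct my_dev and my_pagemap_ops are made-up
driver names, but memunmap_pages()/memremap_pages() are the real APIs):

#include <linux/memremap.h>
#include <linux/numa.h>
#include <linux/err.h>

struct my_dev {
	struct dev_pagemap pgmap;
	struct resource *res;
};

static const struct dev_pagemap_ops my_pagemap_ops;

static int my_dev_recreate_pagemap(struct my_dev *mdev, void *new_owner)
{
	/*
	 * Blocks until every device-private page has been freed, so the
	 * driver must have migrated everything back to system memory
	 * first.
	 */
	memunmap_pages(&mdev->pgmap);

	memset(&mdev->pgmap, 0, sizeof(mdev->pgmap));
	mdev->pgmap.type = MEMORY_DEVICE_PRIVATE;
	mdev->pgmap.range.start = mdev->res->start;
	mdev->pgmap.range.end = mdev->res->end;
	mdev->pgmap.nr_range = 1;
	mdev->pgmap.ops = &my_pagemap_ops;
	/*
	 * Fresh owner token for the new interconnect cluster; the owner
	 * is never mutated on a live pagemap, so no barriers needed.
	 */
	mdev->pgmap.owner = new_owner;

	return PTR_ERR_OR_ZERO(memremap_pages(&mdev->pgmap, NUMA_NO_NODE));
}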
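And the revoke locking I mean is just the usual hmm_range_fault pattern
from Documentation/mm/hmm.rst, roughly like below (again a sketch, the
wrapper name and driver_lock are placeholders; the caller sets up
range->notifier and range->dev_private_owner):

#include <linux/hmm.h>
#include <linux/mmu_notifier.h>
#include <linux/mm.h>

/*
 * Returns with @driver_lock held on success so the caller can consume
 * range->hmm_pfns[] before any racing invalidation can proceed.
 */
static int my_fault_and_lock(struct mm_struct *mm,
			     struct mmu_interval_notifier *interval_sub,
			     struct mutex *driver_lock,
			     struct hmm_range *range)
{
	int ret;

again:
	range->notifier_seq = mmu_interval_read_begin(interval_sub);
	mmap_read_lock(mm);
	ret = hmm_range_fault(range);
	mmap_read_unlock(mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;
		return ret;
	}

	mutex_lock(driver_lock);
	/*
	 * The pagemap revoke invalidates the notifier, bumping the
	 * sequence, so a racing hot unplug forces a retry here instead
	 * of letting us consume stale device-private pages.
	 */
	if (mmu_interval_read_retry(interval_sub, range->notifier_seq)) {
		mutex_unlock(driver_lock);
		goto again;
	}
	return 0;
}

Jason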