On Tue, Jul 23, 2024 at 10:30:08AM -0500, Lucas De Marchi wrote:
> On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote:
> >
> > On 22/07/2024 22:06, Lucas De Marchi wrote:
> > > Instead of calling perf_pmu_unregister() when unbinding, defer that to
> > > the destruction of i915 object. Since perf itself holds a reference in
> > > the event, this only happens when all events are gone, which guarantees
> > > i915 is not unregistering the pmu with live events.
> > >
> > > Previously, running the following sequence would crash the system after
> > > ~2 tries:
> > >
> > > 	1) bind device to i915
> > > 	2) wait events to show up on sysfs
> > > 	3) start perf stat -I 1000 -e i915/rcs0-busy/
> > > 	4) unbind driver
> > > 	5) kill perf
> > >
> > > Most of the time this crashes in perf_pmu_disable() while accessing the
> > > percpu pmu_disable_count. This happens because perf_pmu_unregister()
> > > destroys it with free_percpu(pmu->pmu_disable_count).
> > >
> > > With a lazy unbind, the pmu is only unregistered after (5) as opposed to
> > > after (4). The downside is that if a new bind operation is attempted for
> > > the same device/driver without killing the perf process, i915 will fail
> > > to register the pmu (but still load successfully). This seems better
> > > than completely crashing the system.
> >
> > So effectively allows unbind to succeed without fully unbinding the
> > driver from the device? That sounds like a significant drawback and if
> > so, I wonder if a more complicated solution wouldn't be better after
> > all. Or is there precedent for allowing userspace keeping their paws on
> > unbound devices in this way?
>
> keeping the resources alive but "unplugged" while the hardware
> disappeared is a common thing to do... it's the whole point of the
> drmm-managed resource for example. If you bind the driver and then
> unbind it while userspace is holding a ref, next time you try to bind it
> will come up with a different card number. A similar thing that could be
> done is to adjust the name of the event - currently we add the mangled
> pci slot.
>
> That said, I agree a better approach would be to allow
> perf_pmu_unregister() to do its job even when there are open events. On
> top of that (or as a way to help achieve that), make perf core replace
> the callbacks with stubs when pmu is unregistered - that would even kill
> the need for i915's checks on pmu->closed (and fix the lack thereof in
> other drivers).
>
> It can be a can of worms though and may be pushed back by perf core
> maintainers, so it'd be good to have their feedback.

I don't think I understand the problem. I also don't understand drivers
much -- so that might be the problem.

So the problem appears to be that the device just disappears without
warning? How can a GPU go away like that?

Since you have a notion of this device, can't you do this stubbing you
talk about? That is, if your internal device reference becomes NULL, let
the PMU methods preserve the state like no-ops.

And then when the last event goes away, tear down the whole thing.

Again, I'm not sure I'm following.
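
For illustration only, the stubbing idea above can be sketched as a small
userspace model. This is not kernel or i915 code: struct fake_gpu,
struct fake_pmu and every fake_pmu_*() function are hypothetical names
invented for the sketch, and it ignores locking and concurrency, which a
real driver would have to handle. It only shows the shape of the
proposal: once the device reference is NULL the methods stop touching
hardware and keep the last known state, and the final teardown happens
when the last event goes away.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the hardware; not an i915 structure. */
struct fake_gpu {
	uint64_t busy_ticks;
};

/* Hypothetical driver-side PMU state; not the kernel's struct pmu. */
struct fake_pmu {
	struct fake_gpu *gpu;	/* NULL once the driver has been unbound */
	unsigned int nr_events;	/* open events keep this object alive */
	uint64_t last_count;	/* last value read while the device existed */
};

struct fake_pmu *fake_pmu_create(struct fake_gpu *gpu)
{
	struct fake_pmu *pmu = calloc(1, sizeof(*pmu));

	if (pmu)
		pmu->gpu = gpu;
	return pmu;
}

/* Opening a new event is refused once the device is gone. */
int fake_pmu_event_init(struct fake_pmu *pmu)
{
	if (!pmu->gpu)
		return -ENODEV;
	pmu->nr_events++;
	return 0;
}

/* With no device, behave like a no-op and return the preserved state. */
uint64_t fake_pmu_event_read(struct fake_pmu *pmu)
{
	if (pmu->gpu)
		pmu->last_count = pmu->gpu->busy_ticks;
	return pmu->last_count;
}

/*
 * Unbind: drop the device reference. The object is freed here only if
 * no events are open. Returns true if the object was torn down.
 */
bool fake_pmu_unbind(struct fake_pmu *pmu)
{
	pmu->gpu = NULL;
	if (pmu->nr_events == 0) {
		free(pmu);
		return true;
	}
	return false;
}

/*
 * Closing the last event after an unbind tears the whole thing down.
 * Returns true if the object was torn down.
 */
bool fake_pmu_event_destroy(struct fake_pmu *pmu)
{
	assert(pmu->nr_events > 0);
	pmu->nr_events--;
	if (pmu->nr_events == 0 && !pmu->gpu) {
		free(pmu);
		return true;
	}
	return false;
}
```

In this model, steps (3) to (5) of the crashing sequence map to
fake_pmu_event_init(), fake_pmu_unbind() and fake_pmu_event_destroy():
the unbind returns without freeing anything while an event is open, reads
in between return the preserved count, and the free happens only when the
event is closed.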