On Wed, Feb 26, 2025 at 12:23:40AM +0900, Alexandre Courbot wrote: > On Wed Feb 26, 2025 at 12:06 AM JST, Danilo Krummrich wrote: > > On Tue, Feb 25, 2025 at 11:11:07PM +0900, Alexandre Courbot wrote: > >> On Mon Feb 24, 2025 at 9:07 PM JST, Danilo Krummrich wrote: > >> > CC: Gary > >> > > >> > On Mon, Feb 24, 2025 at 10:40:00AM +0900, Alexandre Courbot wrote: > >> >> This inability to sleep while we are accessing registers seems very > >> >> constraining to me, if not dangerous. It is pretty common to have > >> >> functions intermingle hardware accesses with other operations that might > >> >> sleep, and this constraint means that in such cases the caller would > >> >> need to perform guard lifetime management manually: > >> >> > >> >> let bar_guard = bar.try_access()?; > >> >> /* do something non-sleeping with bar_guard */ > >> >> drop(bar_guard); > >> >> > >> >> /* do something that might sleep */ > >> >> > >> >> let bar_guard = bar.try_access()?; > >> >> /* do something non-sleeping with bar_guard */ > >> >> drop(bar_guard); > >> >> > >> >> ... > >> >> > >> >> Failure to drop the guard potentially introduces a race condition, which > >> >> will receive no compile-time warning and potentialy not even a runtime > >> >> one unless lockdep is enabled. This problem does not exist with the > >> >> equivalent C code AFAICT, which makes the Rust version actually more > >> >> error-prone and dangerous, the opposite of what we are trying to achieve > >> >> with Rust. Or am I missing something? > >> > > >> > Generally you are right, but you have to see it from a different perspective. > >> > > >> > What you describe is not an issue that comes from the design of the API, but is > >> > a limitation of Rust in the kernel. People are aware of the issue and with klint > >> > [1] there are solutions for that in the pipeline, see also [2] and [3]. > >> > > >> > [1] https://rust-for-linux.com/klint > >> > [2] https://github.com/Rust-for-Linux/klint/blob/trunk/doc/atomic_context.md > >> > [3] https://www.memorysafety.org/blog/gary-guo-klint-rust-tools/ > >> > >> Thanks, I wasn't aware of klint and it looks indeed cool, even if not perfect by > >> its own admission. But even if the ignore the safety issue, the other one > >> (ergonomics) is still there. > >> > >> Basically this way of accessing registers imposes quite a mental burden on its > >> users. It requires a very different (and harsher) discipline than when writing > >> the same code in C > > > > We need similar solutions in C too, see drm_dev_enter() / drm_dev_exit() and > > drm_dev_unplug(). > > Granted, but the use of these is much more coarsed-grained than what is > expected of IO resources, right? Potentially, yes. But exactly this characteristic has been criticised [1]. [1] https://lore.kernel.org/nouveau/Z7XVfnnrRKrtQbB6@phenom.ffwll.local/ > > > > >> and the correct granularity to use is unclear to me. > >> > >> For instance, if I want to do the equivalent of Nouveau's nvkm_usec() to poll a > >> particular register in a busy loop, should I call try_access() once before the > >> loop? Or every time before accessing the register? > > > > I think we should re-acquire the guard in each iteration and drop it before the > > delay. I think a simple closure would work very well for this pattern? > > > >> I'm afraid having to check > >> that the resource is still alive before accessing any register is going to > >> become tedious very quickly. > >> > >> I understand that we want to protect against accessing the IO region of an > >> unplugged device ; but still there is no guarantee that the device won't be > >> unplugged in the middle of a critical section, however short. Thus the driver > >> code should be able to recognize that the device has fallen off the bus when it > >> e.g. gets a bunch of 0xff instead of a valid value. So do we really need to > >> extra protection that AFAICT isn't used in C? > > > > As mentioned above, we already do similar things in C. > > > > Also, think about what's the alternative. If we remove the Devres wrapper of > > pci::Bar, we lose the control over the lifetime of the bar mapping and it can > > easily out-live the device / driver binding. This makes the API unsound. > > Oh my issue is not with the Devres wrapper, I think it makes sense - > it's more the use of RCU to control access to the resource that I find > too constraining. And I'm pretty sure there will be more users of the > same opinion as more drivers using it get written. What do you suggest? > > > > > With this drivers would be able to keep resources acquired. What if after a > > hotplug the physical address region is re-used and to be mapped by another > > driver? > > Actually - wouldn't that issue also be addressed by a PCI equivalent to > drm_dev_enter() and friends that ensures the device (and thus its > devres resources) stay in place? I'm not sure I get the idea, but we can *not* have the device resources stay in place once the device is unbound (e.g. keep the resource region acquired by the driver). Consequently, we have to have a way to revoke access to the corresponding pci::Bar. > > Using Rust, I can imagine (but not picture precisely yet) some method of > the device that returns a reference to an inner structure containing its > resources, available with immediate access. Since it would be > coarser-grained, it could rely on something less constraining than RCU > without a noticeable performance penalty. We had similar attempts when we designed this API, i.e. a common Revocable in the driver private data of a device. But it had some chicken-egg issues on initialization in probe(). Besides that, it wouldn't get you rid of the Revocable, since the corresponding resource are only valid while the driver is bound to a device, not for the entire lifetime of the device.