On 21.07.23 12:33, Asahi Lina wrote:
[SNIP]
I've already tried to explain the issue. The DRM scheduler design, as
it stands today, makes it impractical to write a safe Rust abstraction
for it. This is a fact. Christian has repeatedly NAKed my attempts at
changing it to make such a safe abstraction possible, and is clearly
opposed to the fundamental lifetime requirements change I am trying to
make here. Therefore, we are at an impasse.
It's unfortunate, but given this situation, the DRM scheduler will not
be available to Rust DRM drivers. I thought the goal of the DRM
subsystem common code was to cater to multiple drivers and usage
approaches, with an emphasis on doing things "right" and avoiding
design issues that are common mistakes in driver design. Clearly, this
is not the case for all of DRM, at least not the DRM scheduler.
In software engineering, complex lifetime requirements are an
anti-pattern, which is one reason why Rust draws a line between safe
and unsafe APIs. For a safe API, it is up to the API developer to
design it such that it cannot be misused in a way that causes memory
safety issues, and the only lifetime requirements it can impose are
those that can be expressed in the Rust type system and statically
checked at compile time. The DRM scheduler's complex chain of lifetime
requirements cannot be expressed that way, and wrapping it in enough glue
to remove those requirements would require about as much code as just
rewriting it, while adding unacceptable duplication and overhead.
In kernel Rust, we strive to only have safe APIs for components which
have no fundamental reason for being unsafe (things like direct memory
mapping and raw hardware access). The DRM scheduler does not fall into
one of those "inherently unsafe" categories, so it doesn't make sense
to provide a raw unsafe API for it.
This is not completely correct. The DRM scheduler provides a dma_fence
interface as a wrapper around the hardware dma_fence interfaces.
This interface exists to allow core Linux memory management to query the
progress of hardware operations.
So you are working with something very low level here and have to follow
restrictions which Rust can't enforce at the language level, because it
simply can't know about them at compile time.
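To make that concrete, here is a minimal sketch (made-up function name, not
taken from any driver) of how a generic consumer such as core memory
management sees such a fence; it only knows about struct dma_fence and the
generic helpers, and has no idea when the producing driver considers itself
done with the object:

#include <linux/dma-fence.h>

/* Sketch: a generic consumer takes a reference of unknown lifetime and may
 * call back into fence->ops (e.g. get_timeline_name for trace points) long
 * after the scheduler or device that produced the fence has been torn down
 * from the driver's point of view. */
static long consumer_wait_for_work(struct dma_fence *fence)
{
        long ret;

        dma_fence_get(fence);
        ret = dma_fence_wait(fence, false);
        dma_fence_put(fence);

        return ret;
}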
Doing so would just expose Rust code to the kind of subtle safety
implications that cause memory problems every day in C. If I were to
use drm_sched's unsafe API "as is" in my driver, it would *by far* be
the least auditable, most complex usage of unsafe code in the entire
driver, and I have no confidence that I would be able to get it right
and keep it right as the driver changes.
I don't see a reason why this cannot be simply fixed in drm_sched, but
Christian disagrees, and has repeatedly (and strongly) NAKed my
attempts at doing so, and indeed NAKed the entire premise of the
change in lifetime requirements itself. So here we are. There is no
solution that will work for everyone that involves drm_sched.
So I don't have a choice other than to just not use drm_sched and roll
my own. I will happily move that Rust implementation to common code if
another Rust driver comes along and wants to use it. I'm not sure if
that kind of thing can be easily made available to C drivers in
reverse, but I guess we'll cross that bridge when a C driver expresses
interest in using it.
Well, to make it clear once more: signaling a dma_fence from the
destructor of a reference-counted object is very problematic! This will
be rejected regardless of whether you do it in C or in Rust.
What we can do is make it safe in the sense that you don't access freed
memory, by making the scheduler fences act even more as a wrapper around
the hardware fence than they do now. But that is quite a change and
requires a bit more than just hacking around
drm_sched_fence_get_timeline_name().
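To spell out the pattern that gets rejected, here is a minimal made-up
example (not code from this series):

#include <linux/dma-fence.h>
#include <linux/kref.h>
#include <linux/slab.h>

struct my_job {
        struct kref refcount;
        struct dma_fence *done_fence;
};

/* Problematic: whether and when done_fence signals now depends on when the
 * last reference to the job happens to be dropped, not on hardware
 * progress.  Consumers, including core memory management, may be blocked
 * waiting on exactly this fence. */
static void my_job_release(struct kref *kref)
{
        struct my_job *job = container_of(kref, struct my_job, refcount);

        dma_fence_signal(job->done_fence);
        dma_fence_put(job->done_fence);
        kfree(job);
}

static void my_job_put(struct my_job *job)
{
        kref_put(&job->refcount, my_job_release);
}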
So far it seems existing C drivers are happy with drm_sched's design and
its complex lifetime requirements, even though those aren't documented
anywhere. Perhaps their authors managed to reverse engineer them from the
source, or someone told them (certainly nobody told me when I started
using drm_sched). Or maybe a bunch of the drm_sched users are actually
broken and have memory safety issues in corner cases. I don't know,
though if I had to bet, I'd bet on the latter.
Actually, let's do a quick search and find out!
panfrost_remove() -> panfrost_device_fini() -> panfrost_job_fini() ->
drm_sched_fini()
There is a direct, synchronous path between device removal and
destroying the DRM scheduler. At no point does it wait for any jobs to
complete. That's what patch #3 in this series tries to fix.
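For comparison, a teardown that plays by drm_sched's current (undocumented)
rules would have to look roughly like this; my_device_wait_idle() is a
hypothetical helper, and this is just a sketch, not the actual patch:

#include <drm/gpu_scheduler.h>

struct my_device {
        struct drm_gpu_scheduler sched;
        /* ... */
};

/* Hypothetical helper: blocks until every job pushed to the scheduler has
 * signaled its finished fence. */
void my_device_wait_idle(struct my_device *mdev);

static void my_job_fini(struct my_device *mdev)
{
        /* The scheduler must not be destroyed while jobs are in flight, so
         * without a fix in drm_sched_fini() itself, the driver has to wait
         * for them first. */
        my_device_wait_idle(mdev);
        drm_sched_fini(&mdev->sched);
}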
In fact, it doesn't even keep the entities alive! It calls
drm_dev_unregister() before everything else, but as the docs for the
DRM driver lifetimes clearly say (see, docs!), objects visible to
userspace must survive that and only be released from the release
callback. drm_sched entities are created/destroyed from
panfrost_job_open()/panfrost_job_close(), which are called from
panfrost_open() and panfrost_postclose(), which are the userspace file
open/close functions.
I already handle that one in the Rust abstraction (it's relatively easy
to fix), so from my point of view it doesn't need a drm_sched patch, but
it is yet another subtle, undocumented lifetime requirement that is,
evidently, impossible to know about and get right without documentation.
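For reference, the rule the lifetime docs do state looks roughly like this
on the driver side (a minimal sketch with made-up names, not panfrost
code):

#include <drm/drm_drv.h>

/* Called only once the last reference to the drm_device is dropped, which
 * can be long after the bus ->remove() callback returned and
 * drm_dev_unregister() was called.  Userspace-visible state must stay
 * valid until this point (or be tied to the device with drmm_* managed
 * allocations). */
static void my_drm_release(struct drm_device *ddev)
{
        /* free userspace-visible, device-level state here, not in ->remove() */
}

static const struct drm_driver my_drm_driver = {
        .release = my_drm_release,
        /* ... */
};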
Meanwhile, panfrost_fence_ops has no release() callback, which means
there is no reference path stopping device removal (and therefore
scheduler teardown) or even module unload while driver fences are
still alive. That leads to the UAF that patch #2 in this series tries to fix.
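One way to close that hole would be for every driver fence to pin the
device until the fence itself is released, roughly like the sketch below
(made-up names, not necessarily what patch #2 does, and keeping the module
itself alive would still need separate handling):

#include <linux/dma-fence.h>
#include <drm/drm_drv.h>

struct my_fence {
        struct dma_fence base;
        struct drm_device *ddev;        /* reference taken at fence creation */
};

static const char *my_fence_get_driver_name(struct dma_fence *fence)
{
        return "my-driver";
}

static const char *my_fence_get_timeline_name(struct dma_fence *fence)
{
        return "my-timeline";
}

static void my_fence_release(struct dma_fence *fence)
{
        struct my_fence *f = container_of(fence, struct my_fence, base);

        /* Drop the reference that kept device teardown from completing
         * while this fence was still alive. */
        drm_dev_put(f->ddev);
        dma_fence_free(fence);          /* the default behaviour we replace */
}

static const struct dma_fence_ops my_fence_ops = {
        .get_driver_name   = my_fence_get_driver_name,
        .get_timeline_name = my_fence_get_timeline_name,
        .release           = my_fence_release,
};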
In other words, as far as I can tell, the panfrost driver gets
*everything* wrong when it comes to the DRM scheduler lifetime
requirements, and will crash and burn if the driver is unbound while
jobs are in flight, while anyone has a file descriptor open at all, or
even while any shared buffer holding a DRM scheduler fence from it is
bound to the display controller at that time.
Yeah, what you wrote is perfectly correct.
Daniel and I have gone back and forth multiple times on how we should
design this, and we opted not to use direct pointers for the contexts
because that allows for simpler driver implementations.
The DRM scheduler doesn't document the lifetime requirements because it
doesn't define the lifetime requirements. By design, the DRM scheduler is
supposed to be a component wrapping around DMA fences, and those DMA
fences have the necessary lifetime definition.
Now, DMA fences have their life cycles explained in the structure
documentation, but it doesn't really say much about the requirements for
dma_fence_ops implementations. We should probably improve that.
So yes, drivers need to keep the structures which might be accessed by
userspace alive even after the underlying device is removed. But
signaling dma_fences is completely independent of that.
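As a rough sketch of that separation (made-up driver structures, locking
omitted): on removal everything outstanding is signaled right away, but
the objects themselves live on until their last reference is dropped.

#include <linux/dma-fence.h>
#include <linux/errno.h>
#include <linux/list.h>

struct my_hw_fence {
        struct dma_fence base;
        struct list_head node;
};

struct my_dev {
        struct list_head fences;        /* outstanding fences */
};

static void my_dev_remove(struct my_dev *mdev)
{
        struct my_hw_fence *f;

        list_for_each_entry(f, &mdev->fences, node) {
                dma_fence_set_error(&f->base, -ENODEV);
                dma_fence_signal(&f->base);
                /* No freeing here: the fence stays alive until the last
                 * dma_fence_put(), which may happen much later and from a
                 * completely different part of the kernel. */
        }
}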
This is why this kind of design is not allowed in Rust.
Well it is allowed, it's just not safe.
Regards,
Christian.
Because nobody gets it right. *Especially* not without docs. I
assumed, like the authors of the Panfrost driver clearly assumed, that
the DRM scheduler API would not have these crazy undocumented hidden
requirements. The only reason I found out the hard way is that I happen
to create and destroy schedulers all the time, not just once globally, so
I would hit the bugs and dangling pointers much more often than
Panfrost users who, most likely, never unbind their devices. But we
both have the same problem.
I think I've done all I can to explain the issues and try to fix them,
so the ball is in your court now. If you want to keep the current
design, that's fine, but Rust code will not be a drm_sched user in
that case. And the rest of the DRM folks in the C world will have to
contend with these issues and fix all the problems in the C drivers
(I'm sure panfrost isn't the only one, it's just literally the first
one I looked at).
As for me, I'm happy to write a simple workqueue-based Rust scheduler
suitable for firmware-managed scheduler devices. Honestly, at this
point, I have very little faith left in my ability to fix all these
issues in drm_sched (I know there's at least one more lurking: I saw a
NULL deref but wasn't able to reproduce it or divine how it could
possibly have happened). That, combined with the hostility from the AMD
folks about my attempts to improve drm_sched even just a little bit,
makes that decision very easy.
Farewell, DRM scheduler. It was nice trying to work with you, but
things just didn't work out. I won't be submitting a v2 to this series
myself. Please ping me if you fix all these fundamental design issues
with drm_sched and think I might actually be able to use it safely in
Rust one day. If, at that point, the API design is solid and safe and the
implementation is done in a way that inspires confidence, then maybe we
can yank out my Rust solution and switch back to drm_sched.
Just please don't expect me to do the work anymore; I've done everything
I can, and this now has to come from you, not me. I've spent
way more time understanding drm_sched, auditing its code,
understanding its design intent, trying to fix it, and getting yelled
at for it than it would take to write a new, clean, safe Rust
scheduler. I don't regret some of the time spent (some of the
implementation details of drm_sched have taught me useful things), but
I'm not prepared to spend any more, sorry.
~~ Lina