Re: [RFC PATCH 1/5] doc: rust: create safety standard

Daniel Almeida <daniel.almeida@xxxxxxxxxxxxx> · Thu, 8 Aug 2024 10:01:55 -0300

Hi Benno,

> On 24 Jul 2024, at 17:31, Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
> 
> On 19.07.24 18:24, Daniel Almeida wrote:
>> Hi Benno,
>> 
>> It’s nice to see this shaping up. I do agree that it’s a bit of a wild
>> west right now.
>> 
>> IMHO, we need a lint to enforce compliance, unless we plan to have every patch
>> reviewed by the RFL community, which is unrealistic as time goes forward. I
>> myself have forgotten to properly document unsafe blocks because it’s easy
>> to miss things when submitting more than a thousand LOC.
>> 
>> A new clippy lint would make sense here, since we already have clippy support
>> in the kernel anyways.
> 
> I definitely see the potential, but I have no experience writing clippy
> lints. I also have no idea if it can detect comments.
> I also think that a lint solution will be very difficult, since it will
> either have to be a full proof assistant that mathematically checks if
> the safety comments are correct, or we still need human review.
> I think that if people are more familiar with safety comments, it will
> be easier, it's just how one improves at coding.
> 
> I don't want to reject formal verification from the get-go; on the
> contrary, I would like to see it applied more in the kernel. Rust has
> several different implementations, but I haven't yet taken an in-depth
> look at them. However, from my past experience with formal proof
> assistants, I have my doubts that everyday developers will pick them up
> more easily/in favor of just plain safety comments.
> 
> I think that we should apply formal verification to those areas that
> have been shown to be very difficult to get right. We currently do not
> have enough Rust to have such areas, but when we do, it might be a
> powerful tool. But I don't see it becoming the norm for Rust code (but
> maybe I am wrong, and I would be very happy to be wrong in this
> instance).
> 
> There are also several clippy lints [1] that we could start using:
> - missing_safety_doc
> - multiple_unsafe_ops_per_block
> - undocumented_unsafe_blocks
> - unnecessary_safety_comment
> - unnecessary_safety_doc
> 
> I personally think we should enable all of them.
> 
> [1]: https://rust-lang.github.io/rust-clippy/master/index.html#/safety
> 
> What did you expect/wish for with a clippy lint? Is it already present
> or did you want something that verifies your safety comments?

Yeah, I wasn’t referring to formal verification, just a lint that will complain when
it finds an unsafe block that has no safety comments at all. The clippy lints you
listed should work fine and, IIUC, Miguel already has a patch to enable (some of) them,
so I don’t think any further action is needed.

> 
>>> On 17 Jul 2024, at 19:12, Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>>> 
>>> `unsafe` Rust code in the kernel is required to have safety
>>> documentation. This is to ensure the correctness of `unsafe` code and is
>>> thus very important.
>>> However, at this point in time there does not exist a standard way of
>>> writing safety documentation. This leads to confusion, as authors
>>> struggle to find the right way to convey their desired intentions.
>>> Readers struggle with correctly interpreting the existing documentation.
>>> 
>>> Add the safety standard that will document the meaning of safety
>>> documentation. This first document gives an overview of the problem and
>>> gives general information about the topic.
>>> 
>>> Signed-off-by: Benno Lossin <benno.lossin@xxxxxxxxx>
>>> ---
>>> Documentation/rust/general-information.rst   |   1 +
>>> Documentation/rust/index.rst                 |   1 +
>>> Documentation/rust/safety-standard/index.rst | 246 +++++++++++++++++++
>>> 3 files changed, 248 insertions(+)
>>> create mode 100644 Documentation/rust/safety-standard/index.rst
>>> 
>>> diff --git a/Documentation/rust/general-information.rst b/Documentation/rust/general-information.rst
>>> index e3f388ef4ee4..ddfe4e2e5307 100644
>>> --- a/Documentation/rust/general-information.rst
>>> +++ b/Documentation/rust/general-information.rst
>>> @@ -54,6 +54,7 @@ the same invocation used for compilation, e.g.::
>>> Please note that Clippy may change code generation, thus it should not be
>>> enabled while building a production kernel.
>>> 
>>> +.. _rust-abstractions:
>>> 
>>> Abstractions vs. bindings
>>> -------------------------
>>> diff --git a/Documentation/rust/index.rst b/Documentation/rust/index.rst
>>> index 46d35bd395cf..968e9aace301 100644
>>> --- a/Documentation/rust/index.rst
>>> +++ b/Documentation/rust/index.rst
>>> @@ -39,6 +39,7 @@ configurations.
>>>    quick-start
>>>    general-information
>>>    coding-guidelines
>>> +    safety-standard/index
>>>    arch-support
>>>    testing
>>> 
>>> diff --git a/Documentation/rust/safety-standard/index.rst b/Documentation/rust/safety-standard/index.rst
>>> new file mode 100644
>>> index 000000000000..1cbc8d3dea04
>>> --- /dev/null
>>> +++ b/Documentation/rust/safety-standard/index.rst
>>> @@ -0,0 +1,246 @@
>>> +.. SPDX-License-Identifier: GPL-2.0
>>> +.. highlight:: rust
>>> +
>>> +====================
>>> +Rust Safety Standard
>>> +====================
>>> +
>>> +Safe Rust code cannot have memory related bugs. This is a guarantee by the Rust compiler. Of course
>>> +it is not without caveats: no compiler bugs, no bugs in the specification etc. But the possibly most
>>> +important caveat is that of ``unsafe`` code. ``unsafe`` code needs to follow certain rules in order
>>> +for safe code to enjoy the no-memory-bugs privilege. A simple example of such a rule is that
>>> +references must be valid for the duration of their lifetime. If any rule is violated, it can lead
>>> +to undefined behavior even in safe code! The term undefined behavior in Rust has a lot stricter
>>> +meaning than in C or C++: UB in Rust is totally forbidden. In C one might rely on the compiler
>>> +implementation to ensure correct code generation, but that is not the case for Rust. You can read
>>> +more about UB in Rust
>>> +`here <https://doc.rust-lang.org/reference/behavior-considered-undefined.html>`_.
>>> +
>>> +If ``unsafe`` code makes our life this difficult, one might ask the question "why do we even need
>>> +it?" and the answer to that is that it gives users an escape hatch to do things that the compiler
>>> +normally forbids. ``unsafe`` code is a tool that enables programmers to write more performant code,
>>> +or code that interacts with hardware or C. These things are particularly important in kernel
>>> +development.
>>> +
>>> +The most effective way to prevent issues in ``unsafe`` code is to just not write ``unsafe`` code in
>>> +the first place. That is why minimizing the amount of ``unsafe`` code is very important. For
>>> +example, drivers are not allowed to directly interface with the C side. Instead of directly
>>> +communicating with C functions, they interact with Rust abstractions. This concentrates the usage
>>> +of ``unsafe`` code, making it easy to fix issues, since only the abstraction needs to be fixed.
>>> +Abstractions also allow taking advantage of other Rust language features. Read more in
>>> +:ref:`rust-abstractions`.
>> 
>> This is something that I think we should discuss at Kangrejos. I do not think
>> that we should set in stone that the kernel crate is the only place where
>> unsafe code is acceptable.
> 
> Oh then I need to rephrase the above paragraph, since I don't meant to
> say that. What I want to say is this:
> (1) concentrate as much `unsafe` code as possible, and put it somewhere
>     where everyone can use it (ie the `kernel` crate)
> (2) abstract over common use-patterns of `unsafe` code via safe
>     abstractions
> (3) disallow access to *raw* `bindings::` function calls from drivers.
> 
> From what you write below, I think that we are on the same page for (1)
> and (2). What I want to accomplish with (3) is that we don't have hacky
> drivers that are just like a C driver with `unsafe` sprinkled
> throughout. If you want to do that, just write a C driver instead.
> 
> As Alice already replied, there should be no issue with having an
> `unsafe` function in an Abstraction. But we should strive for them to be
> as few as possible.
> 
>> I am in no way disagreeing with the use of safe abstractions, but I think we
>> should have abstractions where they make sense. This is the case in the vast
>> majority of times, but not in *all* of them.
>> 
>> A simple example is a MMIO read or write. Should a driver be forbidden to call
>> readX/writeX for an address it knows to be valid? How can you possibly write an
>> abstraction for this, when the driver is the only one aware of the actual
>> device addresses, and when the driver author is the person with actual access
>> to the HW docs?
> 
> One idea that I have in this concrete example would be to make the
> driver specify in exactly one place what the addresses are that are
> read/writeable. If there are devices with dynamic addresses, then we
> could additionally provide an `unsafe` API, but link to the safe one for
> people to prefer.
> 
>> If a driver is written partially in Rust, and partially in C, and it gets a
>> pointer to some kcalloc’d memory in C, should It be forbidden to use unsafe
>> in order to build a slice from that pointer? How can you possibly design a
>> general abstraction for something that is, essentially, a driver-internal API?
> 
> This also would be a good example for an exception of (3). In this case,
> you could still write a driver-specific abstraction that does everything
> under the hood and then every place in the driver can use the safe
> abstraction.
> 
>> For these corner cases, a simple safety comment should suffice. By all means,
>> let's strive to push as much of the unsafe bits into the kernel crate. But,
>> IMHO, we shouldn’t treat Rust drivers as some unprivileged entity, they’re
>> also kernel code, after all.
> 
> That is true, but I want to prevent that we just "ship it" and then a
> couple of days later it turns out that there was a good abstraction
> after all. I personally like to spend a lot of time thinking about safe
> abstractions before giving in to `unsafe`, but I understand that we need
> to find a balance. In the end, we can also always change things. But
> when something lands, it most of the time won't get people thinking
> about whether there is a better way of doing things. Not unless the
> status quo is annoying/burdensome, at which point it already "was too
> late", ie there could have been more thought at the beginning.
> 
> ---
> Cheers,
> Benno

I see,

There has been extensive discussion about this topic in this series and I no
longer see any problems. Thanks everybody for all the clarification provided.

— Daniel