Re: [RFC] rust: types: Add read_once and write_once

Benno Lossin <benno.lossin@xxxxxxxxx> · Thu, 26 Oct 2023 07:30:23 +0000

On 26.10.23 01:02, Boqun Feng wrote:
> On Wed, Oct 25, 2023 at 09:51:28PM +0000, Benno Lossin wrote:
>>> In theory, `read_volatile` and `write_volatile` in Rust can have UB in
>>> case of the data races [1]. However, kernel uses volatiles to implement
>>
>> I would not write "In theory", but rather state that data races involving
>> `read_volatile` are documented to still be UB.
>>
> 
> Good point.
> 
>>> READ_ONCE() and WRITE_ONCE(), and expects races on these marked accesses
>>
>> Missing "`"?
>>
> 
> Yeah, but these are C macros, and here is the commit log, so I wasn't so
> sure I want to add "`", but I guess it's good for consistency.

I was just wondering if it was intentional, since you did it below.

>>> don't cause UB. And they are proven to have a lot of usages in kernel.
>>>
>>> To close this gap, `read_once` and `write_once` are introduced, they
>>> have the same semantics as `READ_ONCE` and `WRITE_ONCE` especially
>>> regarding data races under the assumption that `read_volatile` and
>>
>> I would separate implementation from specification. We specify
>> `read_once` and `write_once` to have the same semantics as `READ_ONCE`
>> and `WRITE_ONCE`. But we implement them using
>> `read_volatile`/`write_volatile`, so we might still encounter UB, but it
>> is still a sort of best effort. As soon as we have the actual thing in
>> Rust, we will switch the implementation.
>>
> 
> Sounds good, I will use this in the next version.
> 
>>> `write_volatile` have the same behavior as a volatile pointer in C from
>>> a compiler point of view.
>>>
>>> Longer term solution is to work with Rust language side for a better way
>>> to implement `read_once` and `write_once`. But so far, it should be good
>>> enough.
>>>
>>> Suggested-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
>>> Link: https://doc.rust-lang.org/std/ptr/fn.read_volatile.html#safety [1]
>>> Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
>>> ---
>>>
>>> Notice I also make the primitives only work on T: Copy, since I don't
>>> think Rust side and C side will use a !Copy type to communicate, but we
>>> can always remove that constraint later.
>>>
>>>
>>>   rust/kernel/prelude.rs |  2 ++
>>>   rust/kernel/types.rs   | 43 ++++++++++++++++++++++++++++++++++++++++++
>>>   2 files changed, 45 insertions(+)
>>>
>>> diff --git a/rust/kernel/prelude.rs b/rust/kernel/prelude.rs
>>> index ae21600970b3..351ad182bc63 100644
>>> --- a/rust/kernel/prelude.rs
>>> +++ b/rust/kernel/prelude.rs
>>> @@ -38,3 +38,5 @@
>>>   pub use super::init::{InPlaceInit, Init, PinInit};
>>>
>>>   pub use super::current;
>>> +
>>> +pub use super::types::{read_once, write_once};
>>
>> Do we really want people to use these so often that they should be in
>> the prelude?
>>
> 
> The reason I put them in the prelude is that `READ_ONCE` and `WRITE_ONCE`
> have ~7000 users in total in the kernel, but now that I think about it,
> maybe it's better not to.

I think we should start out with it not in the prelude. Drivers should
not need to call this often (I hope that only abstractions actually need
this).

>> Sure there will not really be any name conflicts, but I think an
>> explicit import might make sense.
>>
>>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
>>> index d849e1979ac7..b0872f751f97 100644
>>> --- a/rust/kernel/types.rs
>>> +++ b/rust/kernel/types.rs
>>
>> I don't think this should go into `types.rs`. But I do not have a good
>> name for the new module.
>>
> 
> kernel::sync?

I like that.

>>> @@ -432,3 +432,46 @@ pub enum Either<L, R> {
>>>       /// Constructs an instance of [`Either`] containing a value of type `R`.
>>>       Right(R),
>>>   }
>>> +
>>> +/// (Concurrent) Primitives to interact with C side, which are considered as marked access:
>>> +///
>>> +/// tools/memory-model/Documentation/access-marking.txt
>>> +
>>
>> Accidental empty line? Or is this meant as a comment for both
>> functions?
>>
> 
> Right, it's the documentation for both functions.

That will not work; it will be rendered only on `read_once`.

Maybe just copy it to both and also cross-link the two functions, so
that `read_once` mentions its counterpart `write_once` and vice versa.
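
A minimal sketch of that layout, built on `core::ptr` outside the kernel
tree. The `read_once` signature is the one from the patch; the `write_once`
signature and its safety list are assumptions, since that part of the patch
is not quoted here:

```rust
use core::ptr;

/// Reads the value at `src` as a marked access; the counterpart of C `READ_ONCE()`.
///
/// See [`write_once`] for the matching store.
///
/// # Safety
///
/// * `src` must be valid for reads.
/// * `src` must be properly aligned.
/// * `src` must point to a properly initialized value of type `T`.
#[inline(always)]
pub unsafe fn read_once<T: Copy>(src: *const T) -> T {
    // SAFETY: validity, alignment and initialization are guaranteed by the caller.
    unsafe { ptr::read_volatile(src) }
}

/// Writes `value` to `dst` as a marked access; the counterpart of C `WRITE_ONCE()`.
///
/// See [`read_once`] for the matching load.
///
/// # Safety
///
/// * `dst` must be valid for writes.
/// * `dst` must be properly aligned.
// NOTE: assumed signature, not taken from the quoted patch.
#[inline(always)]
pub unsafe fn write_once<T: Copy>(dst: *mut T, value: T) {
    // SAFETY: validity and alignment are guaranteed by the caller.
    unsafe { ptr::write_volatile(dst, value) }
}
```

Each function carries its own summary and links to the other, so rustdoc
renders the shared context on both items.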

>>> +/// The counterpart of C `READ_ONCE()`.
>>> +///
>>> +/// The semantics is exactly the same as `READ_ONCE()`, especially when used for intentional data
>>> +/// races.
>>
>> It would be great if these semantics are either linked or spelled out
>> here. Ideally both.
>>
> 
> Actually I haven't found any document stating that `READ_ONCE()` racing
> with writes is not UB: we do have a document saying `READ_ONCE()` will
> disable KCSAN checks, but no document says (explicitly) that it's not UB.
> 
>>> +///
>>> +/// # Safety
>>> +///
>>> +/// * `src` must be valid for reads.
>>> +/// * `src` must be properly aligned.
>>> +/// * `src` must point to a properly initialized value of type `T`.
>>> +#[inline(always)]
>>> +pub unsafe fn read_once<T: Copy>(src: *const T) -> T {
>>
>> Why only `T: Copy`?
>>
> 
> I actually explained this above, after "---" of the commit log, but

Oh I missed that, sorry.

> maybe it's worth its own documentation? The reason that it only works

Yeah, let's document it. Otherwise I agree with your reasoning.

> with `T: Copy`, is that these primitives should be mostly used for
> C/Rust communication, and using a `T: !Copy` is probably wrong (or at
> least complicated) for communication, since users need to handle which
> one should be used after `read_once()`. This is in the same spirit as
> `read_volatile` documentation:
> 
> ```
> Like read, read_volatile creates a bitwise copy of T, regardless of
> whether T is Copy. If T is not Copy, using both the returned value and
> the value at *src can violate memory safety. However, storing non-Copy
> types in volatile memory is almost certainly incorrect.
> ```
> 
> I want to start with restricted usage.
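
To illustrate the hazard the quoted `read_volatile` documentation describes,
here is a small sketch in plain Rust. `duplicate_owner` is a hypothetical
helper, not part of the patch; it shows that a volatile read of a non-`Copy`
value yields a second owner of the same heap buffer:

```rust
use core::mem::ManuallyDrop;
use core::ptr;

/// Hypothetical helper: bitwise-copies a `String` out of `src` with
/// `read_volatile`, which is permitted for non-`Copy` types.
///
/// The copy aliases the heap buffer owned by `src`, so it is wrapped in
/// `ManuallyDrop` to prevent a double free. Only read through it.
pub fn duplicate_owner(src: &String) -> ManuallyDrop<String> {
    // SAFETY: `src` is a reference, so it is valid for reads, properly
    // aligned, and points to an initialized `String`.
    ManuallyDrop::new(unsafe { ptr::read_volatile(src as *const String) })
}

// With a `T: Copy` bound, as in the patch, an equivalent call such as
// `read_once::<String>(..)` would be rejected at compile time, so the
// caller never has to decide which of the two owners stays alive.
```

Without the `ManuallyDrop` wrapper, both `String` values would free the same
buffer when dropped.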
> 
>>> +    // SAFETY: the read is valid because of the function's safety requirement, plus the assumption
>>> +    // here is that 1) a volatile pointer dereference in C and 2) a `read_volatile` in Rust have the
>>> +    // same semantics, so this function should have the same behavior as `READ_ONCE()` regarding
>>> +    // data races.
>>
>> I would explicitly state that we might have UB here due to data races.
>> But that we have not seen any invalid codegen and thus assume there to
> 
> I'd rather not claim this (no invalid codegen), not because it's not
> true, but because it's not under our control. We have written doc in

But it is under our control: we pin the compiler version and can always
check whether the codegen is correct. If someone finds that it is not,
we also want to be informed, so I think we should write that we rely on
it here.
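
As a rough sketch of such a check, a small probe function can be compiled
with the pinned toolchain and its assembly inspected (for example with
`rustc -O --emit asm`). `read_twice` is a hypothetical name; the expectation,
which is an assumption to be verified rather than a guarantee, is that both
volatile reads show up as separate loads:

```rust
use core::ptr;

/// Hypothetical codegen probe: performs two volatile reads of the same
/// location. If the compiler treats `read_volatile` like a C volatile
/// access, the two loads should be neither merged nor elided.
///
/// # Safety
///
/// `src` must be valid for reads, properly aligned, and initialized.
#[inline(never)]
pub unsafe fn read_twice(src: *const u32) -> (u32, u32) {
    // SAFETY: guaranteed by the caller for both reads.
    let first = unsafe { ptr::read_volatile(src) };
    // SAFETY: same requirements as above.
    let second = unsafe { ptr::read_volatile(src) };
    (first, second)
}
```

This only checks that accesses are preserved; it says nothing about how
racing accesses behave.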

> Rust says:
> 
> ```
> ... so the precise semantics of what “volatile” means here is subject
> to change over time. That being said, the semantics will almost always
> end up pretty similar to C11’s definition of volatile.
> ```

But this is not a guarantee that they behave exactly the same as C11
_now_.

-- 
Cheers,
Benno

> , so we have some confidence to say `read_volatile` equals to a volatile
> read, and `write_volatile` equals to a volatile write. Therefore, we can
> assume they have the same behaviors as `READ_ONCE()` and `WRITE_ONCE()`,
> but that's it. Going further to talk about codegen means we have more
> guarantee from Rust compiler implementation.
> 
> In other words, we are not saying racing `read_volatile`s don't have
> UBs, we are saying racing `read_volatile`s behave the same as a volatile
> read on UBs.