Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Benno Lossin <benno.lossin@xxxxxxxxx> · Thu, 25 Jul 2024 20:32:10 +0000

On 25.07.24 22:06, Boqun Feng wrote:
> Hi Benno,
> 
> Thanks for taking a look.
> 
> On Thu, Jul 25, 2024 at 06:51:56PM +0000, Benno Lossin wrote:
>> On 10.07.24 05:24, Boqun Feng wrote:
>>> As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
>>> to add explanation of the "ARef pattern" to cover the most "DO" and "DO
>>> NOT" cases when wrapping a self-refcounted C type.
>>>
>>> Hence an "ARef pattern" section is added in the documentation of `ARef`.
>>>
>>> Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
>>> ---
>>> This is motivated by:
>>>
>>> 	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
>>>
>>>  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 156 insertions(+)
>>>
>>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
>>> index bd189d646adb..70fdc780882e 100644
>>> --- a/rust/kernel/types.rs
>>> +++ b/rust/kernel/types.rs
>>> @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
>>>  ///
>>>  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
>>>  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
>>> +///
>>> +/// # [`ARef`] pattern
>>> +///
>>> +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
>>
>> I would have written "[...] struct which is reference-counted, because
>> [...]", is there a specific reason you wrote "its own"?
>>
> 
> "its own" indicates the reference counters are inside the object (i.e.
> self refcounted), it's different than `Arc<T>` where the reference
> counters are "attached" to `T`. Your version looks good to me as well.

I thought about that as well, but the paragraph above talks about a C
struct, so what is meant by "its own" there?
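
To make the distinction concrete, here is a rough userspace sketch (not
kernel code). `Foo`, `get()`, `put()` and the two helper functions are
made up for illustration, and `Cell` stands in for whatever counter the
C side would really use:

```rust
use std::cell::Cell;
use std::sync::Arc;

// Hypothetical stand-in for a C `struct foo`: the counter is a field of the
// object itself ("its own" refcount).
struct Foo {
    refcount: Cell<usize>,
    bar: u32,
}

impl Foo {
    // Stand-ins for `get_foo()` / `put_foo()`.
    fn get(&self) {
        self.refcount.set(self.refcount.get() + 1);
    }

    fn put(&self) {
        self.refcount.set(self.refcount.get() - 1);
    }
}

// Self-refcounted: the count is read from the object.
fn intrusive_count() -> usize {
    let foo = Foo { refcount: Cell::new(1), bar: 42 };
    foo.get();
    let count = foo.refcount.get();
    foo.put();
    debug_assert_eq!(foo.bar, 42);
    count
}

// `Arc<T>`: the count lives in the allocation that `Arc` wraps around `T`;
// `u32` itself knows nothing about it.
fn attached_count() -> usize {
    let a = Arc::new(42u32);
    let b = Arc::clone(&a);
    Arc::strong_count(&b)
}
```

In the first case the counter is part of `Foo`, in the second it is
something `Arc` adds around the value.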

>>> +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
>>> +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
>>
>> Not exactly sure I understand your point here, what exactly is the
>> advantage of decoupling the operations?
>> In my mind the following points are the advantages of using `ARef`:
>> (1) prevents having to implement multiple abstractions for a single C
>>     object: say there is a `struct foo` that is both used via reference
>>     counting and by-value on the stack. Without `ARef`, we would have to
>>     write two abstractions, one for each use-case. With `ARef`, we can
>>     have one `Foo` that can be wrapped with `ARef` to represent a
>>     reference-counted object.
>> (2) `ARef<T>` always represents a reference counted object, so it helps
>>     with understanding the code. If you read `Foo`, you cannot be sure
>>     if it is heap or stack allocated.
>> (3) generalizes common code of reference-counted objects (ie avoiding
>>     code duplication) and concentration of `unsafe` code.
>>
>> In my opinion (1) is the most important, then (2). And (3) is a nice
>> bonus. If you agree with the list above (maybe you also have additional
>> advantages of `ARef`?) then it would be great if you could also add them
>> somewhere here.
>>
> 
> Basically to me, the advantages are mostly (1) and (2) in your list,
> thank you for the list. And I did try to use an example (below) to
> explain these, because I felt an example of the bad cases is
> straightforward.
> 
> I will add your list here, because although an example may be
> straightforward to read, a list of advantages is better for
> reference. Again, thanks a lot!
> 
>>> +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
>>> +/// "[`ARef`] pattern", i.e. **bad case**:
>>
>> Instead of "bad case" I would have written "i.e. you want to avoid this:".
>>
> 
> I'm OK with your version, but for my personal interest, why? ;-)

I felt like "bad case" did not "flow" right when reading and I also
think that "you want to avoid this" sounds more polite :)

>>> +///
>>> +/// ```ignore
>>> +/// pub struct Foo(NonNull<foo>);
>>> +///
>>> +/// impl Foo {
>>> +///     // An operation on the pointer.
>>> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
>>> +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
>>> +///         // `from_ptr()`: does it create a new reference, or does it continue using the
>>> +///         // caller's reference?
>>> +///         unsafe { get_foo(ptr); }
>>> +///
>>> +///         unsafe { Foo(NonNull::new_unchecked(ptr)) }
>>> +///     }
>>> +///
>>> +///     // An operation on the object.
>>> +///     pub fn get_bar(&self) -> Bar {
>>> +///         unsafe { (*self.0.as_ptr()).bar }
>>> +///     }
>>> +/// }
>>> +///
>>> +/// // Plus, `Clone` and `Drop` also need to be implemented manually.
>>> +/// impl Clone for Foo {
>>> +///     fn clone(&self) -> Self {
>>> +///         unsafe { get_foo(self.0.as_ptr()); }
>>> +///
>>> +///         Foo(self.0)
>>> +///     }
>>> +/// }
>>> +///
>>> +/// impl Drop for Foo {
>>> +///     fn drop(&mut self) {
>>> +///         unsafe { put_foo(self.0.as_ptr()); }
>>> +///     }
>>> +/// }
>>> +/// ```
>>> +///
>>> +/// In this case, it's hard to tell whether `Foo` represents an object of `foo` or a pointer to
>>> +/// `foo`.
>>> +///
>>> +/// However, with the [`ARef`] pattern, `foo` can be wrapped as follows:
>>> +///
>>> +/// ```ignore
>>> +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
>>
>> I would disagree for the reason that `Opaque` is needed. You need it if
>> the `foo` eg contains a bool, since C might just write a nonsense
>> integer which would then result in immediate UB in Rust.
>> Other reasons might be that certain bytes of `foo` are written to by
>> other threads, even though on the Rust side we have `&mut Foo` (eg a
>> `mutex`).
>>
> 
> hmm.. "since there usually exist C operations on ..." include these two
> cases you mentioned, no? Plus, the reference counters themselves are not
> marked as atomic at the moment, so without `Opaque`, we also have UB
> because of the reference counters. I was trying to summarize all these
> as "C operations on ...", maybe I should say "concurrent C operations on
> ..."? I am trying to be concise here since it's a comment inside a
> comment ;-)

Ah that is your definition of "C operations", I interpreted it as "there
are functions that take `struct foo *`". So maybe it would be good to
spell out exactly why `Opaque` might be needed.
I think it's fine to be verbose here.
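
As a rough illustration of the two reasons I mentioned, here is a
userspace sketch. `OpaqueLike`, `RawFoo`, its fields and
`read_after_c_write()` are all hypothetical names; the wrapper is only
an approximation built from `UnsafeCell<MaybeUninit<T>>` and the real
`Opaque` may differ in its details:

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;

// Hypothetical model of an `Opaque`-like wrapper. `MaybeUninit` means Rust
// does not assume the bytes form a valid `T`, and `UnsafeCell` means they may
// change while a `&OpaqueLike<T>` exists.
#[repr(transparent)]
struct OpaqueLike<T>(UnsafeCell<MaybeUninit<T>>);

impl<T> OpaqueLike<T> {
    fn new(value: T) -> Self {
        OpaqueLike(UnsafeCell::new(MaybeUninit::new(value)))
    }

    // Raw pointer to the wrapped value; dereferencing it is up to the caller.
    fn get(&self) -> *mut T {
        self.0.get().cast::<T>()
    }
}

// Hypothetical C-side layout. `flag` is kept as a byte in this sketch so that
// a value other than 0 or 1 is not an invalid Rust `bool`.
#[repr(C)]
struct RawFoo {
    flag: u8,
    bar: u32,
}

#[repr(transparent)]
struct Foo(OpaqueLike<RawFoo>);

fn read_after_c_write() -> (u8, u32) {
    let foo = Foo(OpaqueLike::new(RawFoo { flag: 0, bar: 1 }));
    let shared: &Foo = &foo;

    // Simulate the C side writing to the object while `&Foo` exists.
    // SAFETY: the pointer is valid and this sketch is single-threaded.
    unsafe {
        (*shared.0.get()).flag = 2;
        (*shared.0.get()).bar = 5;
    }

    // SAFETY: same as above; no concurrent access in this sketch.
    unsafe { ((*shared.0.get()).flag, (*shared.0.get()).bar) }
}
```

Real code would additionally have to justify why each access does not
race with the C side, which this single-threaded sketch sidesteps.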

---
Cheers,
Benno

>>> +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
>>> +/// /// foo` to a `*mut Foo`
>>> +/// #[repr(transparent)]
>>> +/// pub struct Foo(Opaque<foo>);
>>> +///
>>> +/// impl Foo {
>>> +///     pub fn get_bar(&self) -> Bar {
>>> +///         // SAFETY: `self.0.get()` is a valid pointer.
>>> +///         //
>>> +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
>>> +///         // doesn't race with C side. Most cases are either calling a C function, which has its
>>> +///         // own concurrent access protection, or holding a lock.
>>> +///         unsafe { (*self.0.get()).bar }
>>> +///     }
>>> +/// }
>>> +/// ```
>>> +///
>>> +/// ## Avoid `impl AlwaysRefCounted` if unnecessary