Re: [PATCH v2 16/19] gendwarfksyms: Add support for reserved structure fields

Benno Lossin <benno.lossin@xxxxxxxxx> · Mon, 19 Aug 2024 21:46:59 +0000

On 19.08.24 20:25, Greg Kroah-Hartman wrote:
> On Sat, Aug 17, 2024 at 01:19:55PM +0000, Benno Lossin wrote:
>> On 17.08.24 09:41, Greg Kroah-Hartman wrote:
>>> On Fri, Aug 16, 2024 at 08:50:53AM -0700, Sami Tolvanen wrote:
>>>> On Fri, Aug 16, 2024 at 12:20 AM Greg Kroah-Hartman
>>>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>> On Thu, Aug 15, 2024 at 05:39:20PM +0000, Sami Tolvanen wrote:
>>>>> Especially as I have no idea how you are going to do
>>>>> this with the rust side of things, this all will work for any structures
>>>>> defined in .rs code, right?
>>>>
>>>> Yes, Rust structures can use the same scheme. Accessing union members
>>>> might be less convenient than in C, but can presumably be wrapped in
>>>> helper macros if needed.
>>>
>>> That feels ripe for problems for any rust code as forcing a helper macro
>>> for a "normal" access to a structure field is going to be a lot of churn
>>> over time.  Is the need for a macro due to the fact that accessing a
>>> union is always considered "unsafe" in rust?  If that's the case, ick,
>>> this is going to get even messier even faster as the need for sprinkling
>>> unsafe accesses everywhere for what used to be a normal/safe one will
>>> cause people to get nervous...
>>
>> The reason for union field access being unsafe in Rust is that you can
>> easily shoot yourself in the foot. For example:
>>
>>     union Foo {
>>         a: bool,
>>         b: i32,
>>     }
>>
>>     let foo = Foo { b: 3 };
>>     println!("{}", unsafe { foo.a });
>>
>> This is UB, since `3` is of course not a valid value for `bool`. With
>> unions the compiler doesn't know which variant is active.
> 
> Understood, then why attempt to use a union for this type of "abi safe
> padding"?

I don't follow; I thought this was the idea from the thread above (i.e.
just do what C does).

>> Since unions are unsafe in Rust, we don't really use them directly (in
>> the `kernel` crate, we have 0 union definitions). Instead we use certain
>> unions from the stdlib such as `MaybeUninit`. But the fields of that
>> union are private and never accessed.
>>
>> In general, unions in Rust are very important primitive types, but they
>> are seldom used directly. Instead enums are used a lot more, since you
>> don't need to roll your own tagged unions.
>>
>> For this use-case (the one in the patch), I don't really know if we want
>> to copy the approach from C. Do we even support exporting kABI from
>> Rust?
> 
> That's the goal here, you want to create an abi that can change over
> time without "breaking" the abi.  Usually this is just adding additional
> padding in structures to have room for new additions.
> 
>> If yes, then I would recommend we tag it in the source code
>> instead of using a union. Here is the example from the patch, adapted:
>>
>>     #[repr(C)] // needed for layout stability
>>     pub struct Struct1 {
>>         a: u64,
>>         #[kabi_reserved(u64)] // this marker is new
>>         _reserved: u64,
>>     }
>>
>> And then to use the reserved field, you would do this:
>>
>>     #[repr(C)]
>>     pub struct Struct1 {
>>         a: u64,
>>         #[kabi_reserved(u64)]
>>         b: Struct2,
>>     }
>>
>>     #[repr(C)]
>>     pub struct Struct2 {
>>         b: i32,
>>         v: i32,
>>     }
>>
>> The attribute would check that the size of the two types match and
>> gendwarfksyms would use the type given in "()" instead of the actual
>> type.
> 
> Remember the "goal" here is to NOT have to modify the places in the
> kernel that use the new field in the structure, but for that to "just
> work".  Your change here wouldn't allow that as any use of the new "b"
> field would have to be through something in "Struct2", not directly in
> Struct1, right?

This confuses me, since I thought that in C you would need to use the
new fields. So for example we have

    void increment(struct struct1 *x)
    {
        x->a += 1;
    }

and then we do the extension. If we also want to increment `b`, we
have to do this:

    void increment(struct struct1 *x)
    {
        x->a += 1;
        x->b += 1;
    }

I am not 100% sure if you meant the following problem: if a user uses
the `increment` function like this:

    struct struct1 x = { .a = 0, .__kabi_reserved_0 = 0 };
    increment(&x);

Then in the C example they don't need to change their usage. In Rust
they would, if they created the struct using the initializer syntax. But
we shouldn't make the struct fields public in the first place: then
users can't use the initializer syntax at all, and we would instead
provide stable initialization functions. Translating the entire example
to Rust:

    impl Struct1 {
        pub fn new(a: u64) -> Self {
            Self { a, _reserved: 0 }
        }

        pub fn increment(&mut self) {
            self.a += 1;
        }
    }

Then after adding the new field, this becomes:

    impl Struct1 {
        pub fn new(a: u64) -> Self {
            Self { a, b: Struct2 { b: 0, v: 0 } }
        }

        pub fn increment(&mut self) {
            self.a += 1;
            self.b.b += 1;
        }
    }

So the following user would also not have to change the code:

    let mut x = Struct1::new(0);
    x.increment();
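
For reference, here is a rough sketch of what the size check behind the
proposed `#[kabi_reserved(u64)]` attribute could boil down to. The
attribute itself does not exist; the `const` assertions below are only my
assumption of what it might expand to, and the alignment check is an
extra I added for illustration:

```rust
use core::mem::{align_of, size_of};

#[repr(C)]
pub struct Struct2 {
    b: i32,
    v: i32,
}

#[repr(C)]
pub struct Struct1 {
    a: u64,
    // The hypothetical `#[kabi_reserved(u64)]` marker would go here.
    b: Struct2,
}

// Assumed expansion of the marker: fail the build if the new type does
// not fit exactly into the slot that was reserved as a `u64`.
const _: () = assert!(size_of::<Struct2>() == size_of::<u64>());
// Extra check (my addition): the new type must not need stricter
// alignment than the reserved one, or field offsets could shift.
const _: () = assert!(align_of::<Struct2>() <= align_of::<u64>());

impl Struct1 {
    pub fn new(a: u64) -> Self {
        Self { a, b: Struct2 { b: 0, v: 0 } }
    }

    pub fn increment(&mut self) {
        self.a += 1;
        self.b.b += 1;
    }
}
```

If `Struct2` ever grew beyond 8 bytes, the first assertion would turn
that into a compile error instead of a silent layout change.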

> We can mess with the structure definitions but we should not have to
> touch the places where the structure fields are used at all.  If that's
> going to be a requirement (as it sounds like it would with the use of
> unsafe in the union), then this is not going to be a solution at all.

So the union approach *could* work and the users of the API would not
have to use `unsafe`. But doing it the way I suggest above is going to
be cleaner, as the people who use the API won't ever need to know of the
internals.
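
To illustrate the first point, here is a minimal sketch of how the union
variant could be hidden behind a safe API. The names `Reserved`, `inner`
and the accessor methods are made up for this example and are not from
the patch:

```rust
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Struct2 {
    b: i32,
    v: i32,
}

// Private union: same idea as the C scheme, but not visible to users.
#[repr(C)]
union Reserved {
    _reserved: u64,
    new: Struct2,
}

#[repr(C)]
pub struct Struct1 {
    a: u64,
    inner: Reserved,
}

impl Struct1 {
    pub fn new(a: u64) -> Self {
        Self { a, inner: Reserved { new: Struct2 { b: 0, v: 0 } } }
    }

    pub fn a(&self) -> u64 {
        self.a
    }

    pub fn b(&self) -> i32 {
        // SAFETY: `new()` always initializes the `new` variant, and any
        // 8-byte bit pattern is a valid `Struct2` (two `i32`s).
        unsafe { self.inner.new.b }
    }

    pub fn increment(&mut self) {
        self.a += 1;
        // SAFETY: same argument as in `b()`.
        unsafe { self.inner.new.b += 1 };
    }
}
```

The `unsafe` blocks stay inside the `impl`, so a caller still only
writes `Struct1::new(0)` and `x.increment()`.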

---
Cheers,
Benno