Re: [PATCH] rust: add `module_params` macro

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Benno Lossin" <benno.lossin@xxxxxxxxx> writes:

> On 01.08.24 13:29, Andreas Hindborg wrote:
>> 
>> Hi Benno,
>> 
>> Thanks for the comments!
>> 
>> "Benno Lossin" <benno.lossin@xxxxxxxxx> writes:
>> 
>>> On 05.07.24 13:15, Andreas Hindborg wrote:
>> 
>> [...]
>> 
>>>> +
>>>> +/// Types that can be used for module parameters.
>>>> +///
>>>> +/// Note that displaying the type in `sysfs` will fail if
>>>> +/// [`core::str::from_utf8`] (as implemented through the [`core::fmt::Display`]
>>>> +/// trait) writes more than [`PAGE_SIZE`] bytes (including an additional null
>>>> +/// terminator).
>>>> +///
>>>> +/// [`PAGE_SIZE`]: `bindings::PAGE_SIZE`
>>>> +pub trait ModuleParam: core::fmt::Display + core::marker::Sized {
>>>> +    /// The `ModuleParam` will be used by the kernel module through this type.
>>>> +    ///
>>>> +    /// This may differ from `Self` if, for example, `Self` needs to track
>>>> +    /// ownership without exposing it or allocate extra space for other possible
>>>> +    /// parameter values. This is required to support string parameters in the
>>>> +    /// future.
>>>> +    type Value: ?Sized;
>>>> +
>>>> +    /// Whether the parameter is allowed to be set without an argument.
>>>> +    ///
>>>> +    /// Setting this to `true` allows the parameter to be passed without an
>>>> +    /// argument (e.g. just `module.param` instead of `module.param=foo`).
>>>> +    const NOARG_ALLOWED: bool;
>>>
>>> I think, there is a better way of doing this. Instead of this bool, we
>>> do the following:
>>> 1. have a `const DEFAULT: Option<Self>`
>>> 2. change the type of the argument of `try_from_param_arg` to
>>>    `&'static [u8]`
>>>
>>> That way we don't have the weird behavior of `try_from_param_arg` that
>>> for params that don't have a default value.
>> 
>> Since we have no parameter types for which `NOARG_ALLOWED` is true in
>> this patch set, it is effectively dead code. I will remove it.
>
> Hmm what parameters actually are optional? I looked at the old rust
> branch and only `bool` is marked as optional. Are there others?
>
> If it is used commonly for custom parameters (I could imagine that Rust
> modules have enums as parameters and specifying nothing could mean the
> default value), then it might be a good idea to just include it now.
> (otherwise we might forget the design later)

As far as I can tell from the C code, all parameters are able to have
the `NOARG` flag set. We get a null pointer in the callback in that
case.

If we want to handle this now, we could drop the `default` field
in the Rust module macro. There is no equivalent in the C macros.
And then use an `Option<Option<_>>` to represent the value. `None` would
be an unset parameter. `Some(None)` would be a parameter without a
value. `Some(Some(_))` would be a set parameter with a value. We could
probably fix the types so that only parameters with the `NOARG` flag use
the double option, others use a single option.

Or we could just not adopt this feature in the Rust abstractions.

>
>>>> +
>>>> +    /// Convert a parameter argument into the parameter value.
>>>> +    ///
>>>> +    /// `None` should be returned when parsing of the argument fails.
>>>> +    /// `arg == None` indicates that the parameter was passed without an
>>>> +    /// argument. If `NOARG_ALLOWED` is set to `false` then `arg` is guaranteed
>>>> +    /// to always be `Some(_)`.
>
> [...]
>
>>>> diff --git a/rust/macros/helpers.rs b/rust/macros/helpers.rs
>>>> index 563dcd2b7ace..dc0b47879a8c 100644
>>>> --- a/rust/macros/helpers.rs
>>>> +++ b/rust/macros/helpers.rs
>>>> @@ -107,6 +107,14 @@ pub(crate) struct Generics {
>>>>      pub(crate) ty_generics: Vec<TokenTree>,
>>>>  }
>>>>
>>>> +pub(crate) fn get_string(it: &mut token_stream::IntoIter, expected_name: &str) -> String {
>>>
>>> This name is rather weird, `get_field` makes more sense IMO.
>> 
>> Looking at this, the `get` prefix is not aligned with other helpers. How
>> about `expect_string_field` ?
>
> SGTM
>
>>>> +    assert_eq!(expect_ident(it), expected_name);
>>>> +    assert_eq!(expect_punct(it), ':');
>>>> +    let string = expect_string(it);
>>>> +    assert_eq!(expect_punct(it), ',');
>>>
>>> Why do we require a trailing comma?
>> 
>> For consistency with existing module macro. All keys must be terminated
>> with comma.
>
> Hmm I think that might be a bit unexpected, since everywhere else in
> Rust you are allowed to omit the trailing comma. But I guess if the
> entire `module!` macro does that currently, then we can change that when
> we move to `syn`.

Yes, I agree.

>
> [...]
>
>>>> +                    param_type.to_string(),
>>>> +                    param_ops_path(&param_type).to_string(),
>>>> +                );
>>>> +
>>>> +                self.emit_param("parmtype", &param_name, &param_kernel_type);
>>>
>>> Is the spelling intentional? "parmtype"?
>> 
>> This is intentional. I don't think the kernel is ever parsing this, but
>> it is parsed by the `modinfo` tool.
>
> Hmm, why is it not `paramtype`? Does that conflict with something?

You would have to take that up with the maintainer(s) of the `modinfo`
tool. The name is externally dictated [1].

>
>>>> +                self.emit_param("parm", &param_name, &param_description);
>>>> +                let param_type_internal = param_type.clone();
>>>> +
>>>> +                let read_func = format!(
>>>> +                    "
>>>> +                        pub(crate) fn read(&self)
>>>> +                            -> &<{param_type_internal}
>>>> +                               as kernel::module_param::ModuleParam>::Value {{
>>>
>>> Please add a `::` in front of `kernel::module_param::ModuleParam`. There
>>> are more instances below.
>> 
>> Thanks.
>> 
>>>
>>>> +                            // Note: when we enable r/w parameters, we need to lock here.
>>>> +
>>>> +                            // SAFETY: Parameters do not need to be locked because they are
>>>> +                            // read only or sysfs is not enabled.
>>>> +                            unsafe {{
>>>> +                                <{param_type_internal} as kernel::module_param::ModuleParam>::value(
>>>> +                                    &__{name}_{param_name}_value
>>>> +                                )
>>>> +                            }}
>>>> +                        }}
>>>> +                    ",
>>>> +                    name = info.name,
>>>> +                    param_name = param_name,
>>>> +                    param_type_internal = param_type_internal,
>>>> +                );
>>>> +
>>>> +                let kparam = format!(
>>>> +                    "
>>>> +                    kernel::bindings::kernel_param__bindgen_ty_1 {{
>>>> +                        // SAFETY: Access through the resulting pointer is
>>>> +                        // serialized by C side and only happens before module
>>>> +                        // `init` or after module `drop` is called.
>>>> +                        arg: unsafe {{ &__{name}_{param_name}_value }}
>>>> +                            as *const _ as *mut core::ffi::c_void,
>>>
>>> Here you should use `addr_of[_mut]!` instead of taking a reference.
>> 
>> This is a static initializer, so it would be evaluated in const context.
>> At that time, this is going to be the only reference to
>> `&__{name}_{param_name}_value` which would be const. So it should be
>> fine?
>
> When compiling this [1] with a sufficiently new Rust version, you will
> get an error:
>
>     warning: creating a shared reference to mutable static is discouraged
>      --> src/main.rs:4:22
>       |
>     4 |     let x = unsafe { &foo };
>       |                      ^^^^ shared reference to mutable static
>       |
>       = note: for more information, see issue #114447 <https://github.com/rust-lang/rust/issues/114447>
>       = note: this will be a hard error in the 2024 edition
>       = note: this shared reference has lifetime `'static`, but if the static ever gets mutated, or a mutable reference is created, then any further use of this shared reference is Undefined Behavior
>       = note: `#[warn(static_mut_refs)]` on by default
>     help: use `addr_of!` instead to create a raw pointer
>       |
>     4 |     let x = unsafe { addr_of!(foo) };
>       |                      ~~~~~~~~~~~~~
>
> [1]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c914a438938be6f5fc643ee277efa1d1
>
> So I think we should start using `addr_of!` for mutable static now.

Oh. Thanks for the pointer.

Hmm, `addr_of_mut!` still requires the unsafe block. Hopefully that goes
away as well with the feature you linked as well.

This also requires `const_mut_refs`, but as I recall that is going to be
stabilized soon.

>
>> The safety comment is wrong though.
>> 
>>> Also
>>> will this pointer be used to write to the static, in that case you need
>>> `_mut!`.
>> 
>> Not in this version of the patch set, but potentially in future iterations.
>
> All the more reason to use `addr_of!` IMO.
>
>>>> +                    }},
>>>> +                ",
>>>> +                    name = info.name,
>>>> +                    param_name = param_name,
>>>> +                );
>>>
>>> What is the reason for putting `kparam` and `read_func` outside of the
>>> `write!` below? I think it would be easier to read if they are inlined.
>> 
>> It had different shapes based on other options in the original patch
>> set. I guess I can just inline it in this version.
>> 
>>>
>>>> +                write!(
>>>> +                    self.param_buffer,
>>>> +                    "
>>>> +                static mut __{name}_{param_name}_value: {param_type_internal} = {param_default};
>>>> +
>>>> +                pub(crate) struct __{name}_{param_name};
>>>> +
>>>> +                impl __{name}_{param_name} {{ {read_func} }}
>>>> +
>>>> +                pub(crate) const {param_name}: __{name}_{param_name} = __{name}_{param_name};
>>>
>>> Why do we need a unit struct as a constant? I think it would make more
>>> sense to have a unit struct/empty enum as the type and the `read`
>>> function be without a receiver.
>> 
>> To be able to call `module_parameters::my_parameter.read()`. Other
>> options would be `module_parameters::my_parameter::read()` or
>> `module_parameters::my_parameter_read()`.
>> 
>> I don't think there will be a difference in the generated machine code.
>> I also don't have any particular preference. Probably
>> `module_parameters::my_parameter::read()` is the most idiomatic one.
>
> Yeah, I would prefer if we can avoid having both a constant and a type.
> The type then also can be an empty enum, so no value can be constructed.

Nice trick 👍

>
>>>> +
>>>> +                // Note: the C macro that generates the static structs for the `__param` section
>>>> +                // asks for them to be `aligned(sizeof(void *))`. However, that was put in place
>>>> +                // in 2003 in commit 38d5b085d2a0 (\"[PATCH] Fix over-alignment problem on x86-64\")
>>>> +                // to undo GCC over-alignment of static structs of >32 bytes. It seems that is
>>>> +                // not the case anymore, so we simplify to a transparent representation here
>>>> +                // in the expectation that it is not needed anymore.
>>>> +                // TODO: Revisit this to confirm the above comment and remove it if it happened.
>>>
>>> Should this TODO be fixed before this is merged? Or do you intend for it
>>> to stay?
>>> If this is indeed correct, should this also be changed in the C side (of
>>> course a different patch)?
>> 
>> I dug into this. The original code in this patch must be quite old,
>> because that the code the comment refers to was changed in Nov 2020 from
>> `aligned(sizeof(void *))` to `__aligned(__alignof__(struct
>> kernel_param))`. The commit message says that the rationale for not
>> removing the alignment completely is to prevent the compiler from
>> increasing the alignment, as this would mess up the array stride used in
>> the `__param` section.
>> 
>> So I think we can remove the comment and keep `repr(transparent)`, right?
>> I think `rustc` would not increase the alignment of a `repr(C)` struct
>> for optimization purposes?
>
> I don't know that, maybe Gary or someone else knows how this works.
>
>>>> +                /// Newtype to make `bindings::kernel_param` `Sync`.
>>>> +                #[repr(transparent)]
>>>> +                struct __{name}_{param_name}_RacyKernelParam(kernel::bindings::kernel_param);
>>>> +
>>>> +                // SAFETY: C kernel handles serializing access to this type. We
>>>> +                // never access from Rust module.
>>>> +                unsafe impl Sync for __{name}_{param_name}_RacyKernelParam {{
>>>> +                }}
>>>
>>> Any reason to put the `}` on the next line?
>> 
>> No. Do you have any tricks for formatting multi line strings of code like this?
>
> Not really, I don't think that this is a big deal, since this will
> eventually be replaced by `syn`, which can be formatted more easily.
>
>>>> +
>>>> +                #[cfg(not(MODULE))]
>>>> +                const __{name}_{param_name}_name: *const core::ffi::c_char =
>>>> +                    b\"{name}.{param_name}\\0\" as *const _ as *const core::ffi::c_char;
>>>> +
>>>> +                #[cfg(MODULE)]
>>>> +                const __{name}_{param_name}_name: *const core::ffi::c_char =
>>>> +                    b\"{param_name}\\0\" as *const _ as *const core::ffi::c_char;
>>>> +
>>>> +                #[link_section = \"__param\"]
>>>> +                #[used]
>>>> +                static __{name}_{param_name}_struct: __{name}_{param_name}_RacyKernelParam =
>>>> +                    __{name}_{param_name}_RacyKernelParam(kernel::bindings::kernel_param {{
>>>> +                        name: __{name}_{param_name}_name,
>>>> +                        // SAFETY: `__this_module` is constructed by the kernel at load time
>>>> +                        // and will not be freed until the module is unloaded.
>>>> +                        #[cfg(MODULE)]
>>>> +                        mod_: unsafe {{ &kernel::bindings::__this_module as *const _ as *mut _ }},
>>>> +                        #[cfg(not(MODULE))]
>>>> +                        mod_: core::ptr::null_mut(),
>>>> +                        // SAFETY: This static is actually constant as seen by
>>>> +                        // module code. But we need a unique address for it, so it
>>>> +                        // must be static.
>>>
>>> This safety comment makes no sense, should it be a normal comment?
>> 
>> I removed the unsafe block and the safety comment as unsafe is not
>> required here.
>> 
>>>
>>>> +                        ops: unsafe {{ &{ops} }} as *const kernel::bindings::kernel_param_ops,
>>>
>>> Why is this `unsafe` block needed, the `make_param_ops` macro declares a
>>> non-mut static.
>>>
>>>> +                        perm: 0, // Will not appear in sysfs
>>>> +                        level: -1,
>>>
>>> Why this value?
>> 
>> The kernel has 8 initcall levels. Parameters can be assigned one of
>> these levels to have the parameter initialized just before the init
>> functions for that level are executed. -1 has no effect for loadable modules, but
>> for built-in modules it looks like the args will be initialized just after early
>> boot args (level 0).
>> 
>> At any rate, this is what C side does.
>
> I see, I was just wondering where the magic value comes from (especially
> since the `perm` value has a comment explaining what it does).

I don't think we should add a comment here. The `level` field is not
well documented on C side. Probably the best thing here is to force
people to go read the C source.


Best regards,
Andreas

[1] https://github.com/kmod-project/kmod/blob/af21689dd0f1ef6f40d6ecc323885026a07486f9/tools/modinfo.c#L118







[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux