On 03.06.24 14:01, Andreas Hindborg wrote: > Benno Lossin <benno.lossin@xxxxxxxxx> writes: >> On 01.06.24 15:40, Andreas Hindborg wrote: >>> +impl seal::Sealed for Initialized {} >>> +impl GenDiskState for Initialized { >>> + const DELETE_ON_DROP: bool = false; >>> +} >>> +impl seal::Sealed for Added {} >>> +impl GenDiskState for Added { >>> + const DELETE_ON_DROP: bool = true; >>> +} >>> + >>> +impl<T: Operations> GenDisk<T, Initialized> { >>> + /// Try to create a new `GenDisk`. >>> + pub fn try_new(tagset: Arc<TagSet<T>>) -> Result<Self> { >> >> Since there is no non-try `new` function, I think we should name this >> function just `new`. > > Right, I am still getting used to the new naming scheme. Do you know if > it is documented anywhere? I don't think it is documented, it might only be a verbal convention at the moment. Although [1] is suggesting `new` for general constructors. Since this is the only constructor, one could argue that the recommendation is to use `new` (which I personally find a good idea). [1]: https://rust-lang.github.io/api-guidelines/naming.html [...] >>> +impl<T: Operations> OperationsVTable<T> { >>> + /// This function is called by the C kernel. A pointer to this function is >>> + /// installed in the `blk_mq_ops` vtable for the driver. >>> + /// >>> + /// # Safety >>> + /// >>> + /// - The caller of this function must ensure `bd` is valid >>> + /// and initialized. The pointees must outlive this function. >> >> Until when do the pointees have to be alive? "must outlive this >> function" could also be the case if the pointees die immediately after >> this function returns. > > It should not be plural. What I intended to communicate is that what > `bd` points to must be valid for read for the duration of the function > call. I think that is what "The pointee must outlive this function" > states? Although when we talk about lifetime of an object pointed to by > a pointer, I am not sure about the correct way to word this. Do we talk > about the lifetime of the pointer or the lifetime of the pointed to > object (the pointee). We should not use the same wording for the pointer > and the pointee. > > How about: > > /// - The caller of this function must ensure that the pointee of `bd` is > /// valid for read for the duration of this function. But this is not enough for it to be sound, right? You create an `ARef` from `bd.rq`, which potentially lives forever. You somehow need to require that the pointer `bd` stays valid for reads and (synchronized) writes until the request is ended (probably via `blk_mq_end_request`). >>> + /// - This function must not be called with a `hctx` for which >>> + /// `Self::exit_hctx_callback()` has been called. >>> + /// - (*bd).rq must point to a valid `bindings:request` for which >>> + /// `OperationsVTable<T>::init_request_callback` was called >> >> Missing `.` at the end. > > Thanks. > >> >>> + unsafe extern "C" fn queue_rq_callback( >>> + _hctx: *mut bindings::blk_mq_hw_ctx, >>> + bd: *const bindings::blk_mq_queue_data, >>> + ) -> bindings::blk_status_t { >>> + // SAFETY: `bd.rq` is valid as required by the safety requirement for >>> + // this function. >>> + let request = unsafe { &*(*bd).rq.cast::<Request<T>>() }; >>> + >>> + // One refcount for the ARef, one for being in flight >>> + request.wrapper_ref().refcount().store(2, Ordering::Relaxed); >>> + >>> + // SAFETY: We own a refcount that we took above. We pass that to `ARef`. >>> + // By the safety requirements of this function, `request` is a valid >>> + // `struct request` and the private data is properly initialized. >>> + let rq = unsafe { Request::aref_from_raw((*bd).rq) }; >> >> I think that you need to require that the request is alive at least >> until `blk_mq_end_request` is called for the request (since at that >> point all `ARef`s will be gone). >> Also if this is not guaranteed, the safety requirements of >> `AlwaysRefCounted` are violated (since the object can just disappear >> even if it has refcount > 0 [the refcount refers to the Rust refcount in >> the `RequestDataWrapper`, not the one in C]). > > Yea, for the last invariant of `Request`: > > /// * `self` is reference counted by atomic modification of > /// self.wrapper_ref().refcount(). > > I will add this to the safety comment at the call site: > > // - `rq` will be alive until `blk_mq_end_request` is called and is > // reference counted by `ARef` until then. Seems like you already want to use this here :) [...] >>> + /// This function is called by the C kernel. A pointer to this function is >>> + /// installed in the `blk_mq_ops` vtable for the driver. >>> + /// >>> + /// # Safety >>> + /// >>> + /// This function may only be called by blk-mq C infrastructure. `set` must `set` doesn't exist (`_set` does), you are also not using this requirement. >>> + /// point to an initialized `TagSet<T>`. >>> + unsafe extern "C" fn init_request_callback( >>> + _set: *mut bindings::blk_mq_tag_set, >>> + rq: *mut bindings::request, >>> + _hctx_idx: core::ffi::c_uint, >>> + _numa_node: core::ffi::c_uint, >>> + ) -> core::ffi::c_int { >>> + from_result(|| { >>> + // SAFETY: The `blk_mq_tag_set` invariants guarantee that all >>> + // requests are allocated with extra memory for the request data. >> >> What guarantees that the right amount of memory has been allocated? >> AFAIU that is guaranteed by the `TagSet` (but there is no invariant). > > It is by C API contract. `TagSet`::try_new` (now `new`) writes > `cmd_size` into the `struct blk_mq_tag_set`. That is picked up by > `blk_mq_alloc_tag_set` to allocate the right amount of space for each request. > > The invariant here is on the C type. Perhaps the wording is wrong. I am > not exactly sure how to express this. How about this: > > // SAFETY: We instructed `blk_mq_alloc_tag_set` to allocate requests > // with extra memory for the request data when we called it in > // `TagSet::new`. I think you need a safety requirement on the function: `rq` points to a valid `Request`. Then you could just use `Request::wrapper_ptr` instead of the line below. >>> + let pdu = unsafe { bindings::blk_mq_rq_to_pdu(rq) }.cast::<RequestDataWrapper>(); >>> + >>> + // SAFETY: The refcount field is allocated but not initialized, this >>> + // valid for write. >>> + unsafe { RequestDataWrapper::refcount_ptr(pdu).write(AtomicU64::new(0)) }; >>> + >>> + Ok(0) >>> + }) >>> + } >> >> [...] >> >>> + /// Notify the block layer that a request is going to be processed now. >>> + /// >>> + /// The block layer uses this hook to do proper initializations such as >>> + /// starting the timeout timer. It is a requirement that block device >>> + /// drivers call this function when starting to process a request. >>> + /// >>> + /// # Safety >>> + /// >>> + /// The caller must have exclusive ownership of `self`, that is >>> + /// `self.wrapper_ref().refcount() == 2`. >>> + pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >>> + // SAFETY: By type invariant, `self.0` is a valid `struct request`. By >>> + // existence of `&mut self` we have exclusive access. >> >> We don't have a `&mut self`. But the safety requirements ask for a >> unique `ARef`. > > Thanks, I'll rephrase to: > > // SAFETY: By type invariant, `self.0` is a valid `struct request` and > // we have exclusive access. > >> >>> + unsafe { bindings::blk_mq_start_request(this.0.get()) }; >>> + } >>> + >>> + fn try_set_end(this: ARef<Self>) -> Result<ARef<Self>, ARef<Self>> { >>> + // We can race with `TagSet::tag_to_rq` >>> + match this.wrapper_ref().refcount().compare_exchange( >>> + 2, >>> + 0, >>> + Ordering::Relaxed, >>> + Ordering::Relaxed, >>> + ) { >>> + Err(_old) => Err(this), >>> + Ok(_) => Ok(this), >>> + } >>> + } >>> + >>> + /// Notify the block layer that the request has been completed without errors. >>> + /// >>> + /// This function will return `Err` if `this` is not the only `ARef` >>> + /// referencing the request. >>> + pub fn end_ok(this: ARef<Self>) -> Result<(), ARef<Self>> { >>> + let this = Self::try_set_end(this)?; >>> + let request_ptr = this.0.get(); >>> + core::mem::forget(this); >>> + >>> + // SAFETY: By type invariant, `self.0` is a valid `struct request`. By >>> + // existence of `&mut self` we have exclusive access. >> >> Same here, but in this case, the `ARef` is unique, since you called >> `try_set_end`. You could make it a `# Guarantee` of `try_set_end`: "If >> `Ok(aref)` is returned, then the `aref` is unique." > > Makes sense. I have not seen `# Guarantee` used anywhere. Do you have a link for that use? Alice used it a couple of times, eg in [2]. I plan on putting it in the safety standard. [2]: https://lore.kernel.org/rust-for-linux/20230601134946.3887870-2-aliceryhl@xxxxxxxxxx/ --- Cheers, Benno