On 03.04.24 10:46, Andreas Hindborg wrote: > Benno Lossin <benno.lossin@xxxxxxxxx> writes: > >> On 23.03.24 07:32, Andreas Hindborg wrote: >>> Benno Lossin <benno.lossin@xxxxxxxxx> writes: >>>> On 3/13/24 12:05, Andreas Hindborg wrote: >>>>> +//! implementations of the `Operations` trait. >>>>> +//! >>>>> +//! IO requests are passed to the driver as [`Request`] references. The >>>>> +//! `Request` type is a wrapper around the C `struct request`. The driver must >>>>> +//! mark start of request processing by calling [`Request::start`] and end of >>>>> +//! processing by calling one of the [`Request::end`], methods. Failure to do so >>>>> +//! can lead to IO failures. >>>> >>>> I am unfamiliar with this, what are "IO failures"? >>>> Do you think that it might be better to change the API to use a >>>> callback? So instead of calling start and end, you would do >>>> >>>> request.handle(|req| { >>>> // do the stuff that would be done between start and end >>>> }); >>>> >>>> I took a quick look at the rnull driver and there you are calling >>>> `Request::end_ok` from a different function. So my suggestion might not >>>> be possible, since you really need the freedom. >>>> >>>> Do you think that a guard approach might work better? ie `start` returns >>>> a guard that when dropped will call `end` and you need the guard to >>>> operate on the request. >>> >>> I don't think that would fit, since the driver might not complete the >>> request immediately. We might be able to call `start` on behalf of the >>> driver. >>> >>> At any rate, since the request is reference counted now, we can >>> automatically fail a request when the last reference is dropped and it >>> was not marked successfully completed. I would need to measure the >>> performance implications of such a feature. >> >> Are there cases where you still need access to the request after you >> have called `end`? > > In general no, there is no need to handle the request after calling end. > C drivers are not allowed to, because this transfers ownership of the > request back to the block layer. This patch series defer the transfer of > ownership to the point when the ARef<Request> refcount goes to zero, so > there should be no danger associated with touching the `Request` after > end. > >> If no, I think it would be better for the request to >> be consumed by the `end` function. >> This is a bit difficult with `ARef`, since the user can just clone it >> though... Do you think that it might be necessary to clone requests? > > Looking into the details now I see that calling `Request::end` more than > once will trigger UAF, because C code decrements the refcount on the > request. When we have `ARef<Request>` around, that is a problem. It > probably also messes with other things in C land. Good catch. > > I did implement `Request::end` to consume the request at one point > before I fell back on reference counting. It works fine for simple > drivers. However, most drivers will need to use the block layer tag set > service, that allows conversion of an integer id to a request pointer. > The abstraction for this feature is not part of this patch set. But the > block layer manages a mapping of integer to request mapping, and drivers > typically use this to identify the request that corresponds to > completion messages that arrive from hardware. When drivers are able to > turn integers into requests like this, consuming the request in the call > to `end` makes little sense (because we can just construct more). How do you ensure that this is fine?: let r1 = tagset.get(0); let r2 = tagset.get(0); r1.end_ok(); r2.do_something_that_would_only_be_done_while_active(); One thing that comes to my mind would be to only give out `&Request` from the tag set. And to destroy, you could have a separate operation that also removes the request from the tag set. (I am thinking of a tag set as a `HashMap<u64, Request>`. > > What I do now is issue the an `Option<ARef<Request>>` with > `bindings::req_ref_inc_not_zero(rq_ptr)`, to make sure that the request > is currently owned by the driver. > > I guess we can check the absolute value of the refcount, and only issue > a request handle if the count matches what we expect. Then we can be certain > that the handle is unique, and we can require transfer of ownership of > the handle to `Request::end` to make sure it can never be called more > than once. > > Another option is to error out in `Request::end` if the > refcount is not what we expect. I am a bit confused, why does the refcount matter in this case? Can't the user just have multiple `ARef`s? I think it would be weird to use `ARef<Request>` if you expect the refcount to be 1. Maybe the API should be different? As I understand it, a request has the following life cycle (please correct me if I am wrong): 1. A new request is created, it is given to the driver via `queue_rq`. 2. The driver can now decide what to do with it (theoretically it can store it somewhere and later do something with it), but it should at some point call `Request::start`. 3. Work happens and eventually the driver calls `Request::end`. To me this does not seem like something where we need a refcount (we still might need one for safety, but it does not need to be exposed to the user). -- Cheers, Benno