On Mon, Dec 11, 2023 at 6:23 PM Benno Lossin <benno.lossin@xxxxxxxxx> wrote: > > >>> + unsafe { bindings::init_task_work(callback_head, Some(Self::do_close_fd)) }; > >>> + // SAFETY: The `callback_head` pointer points at a valid and fully initialized task work > >>> + // that is ready to be scheduled. > >>> + // > >>> + // If the task work gets scheduled as-is, then it will be a no-op. However, we will update > >>> + // the file pointer below to tell it which file to fput. > >>> + let res = unsafe { bindings::task_work_add(current, callback_head, TWA_RESUME) }; > >>> + > >>> + if res != 0 { > >>> + // SAFETY: Scheduling the task work failed, so we still have ownership of the box, so > >>> + // we may destroy it. > >>> + unsafe { drop(Box::from_raw(inner)) }; > >>> + > >>> + return Err(DeferredFdCloseError::TaskWorkUnavailable); > >> > >> Just curious, what could make the `task_work_add` call fail? I imagine > >> an OOM situation, but is that all? > > > > Actually, no, this doesn't fail in OOM situations since we pass it an > > allocation for its linked list. It fails only when the current task is > > exiting and wont run task work again. > > Interesting, is there some way to check for this aside from calling > `task_work_add`? I don't think so. We would need to access the `work_exited` constant in `kernel/task_work.c` to do that, but it is not exported. > >> Also, you do not call it when `file` is null, which I imagine to be > >> fine, but I do not know that since the C comment does not cover that > >> case. > > > > Null pointer means that the fd doesn't exist, and it's correct to do > > nothing in that case. > > I would also mention that in a comment (or the SAFETY comment). Okay. > >>> + let file = unsafe { bindings::close_fd_get_file(fd) }; > >>> + if file.is_null() { > >>> + // We don't clean up the task work since that might be expensive if the task work queue > >>> + // is long. Just let it execute and let it clean up for itself. > >>> + return Err(DeferredFdCloseError::BadFd); > >>> + } > >>> + > >>> + // SAFETY: The `file` pointer points at a valid file. > >>> + unsafe { bindings::get_file(file) }; > >>> + > >>> + // SAFETY: Due to the above `get_file`, even if the current task holds an `fdget` to > >>> + // this file right now, the refcount will not drop to zero until after it is released > >>> + // with `fdput`. This is because when using `fdget`, you must always use `fdput` before > >> > >> Shouldn't this be "the refcount will not drop to zero until after it is > >> released with `fput`."? > >> > >> Why is this the SAFETY comment for `filp_close`? I am not understanding > >> the requirement on that function that needs this. This seems more a > >> justification for accessing `file` inside `do_close_fd`. In which case I > >> think it would be better to make it a type invariant of > >> `DeferredFdCloserInner`. > > > > It's because `filp_close` decreases the refcount for the file, and doing > > that is not always safe even if you have a refcount to the file. To drop > > the refcount, at least one of the two following must be the case: > > > > * If the refcount decreases to a non-zero value, then it is okay. > > * If there are no users of `fdget` on the file, then it is okay. > > I see, that makes sense. Is this written down somewhere? Or how does one > know about this? I don't think there's a single place to read about this. The comments on __fget_light allude to something similar, but it makes the blanket statement that you can't call filp_close while an fdget reference exists, even though the reality is a bit more nuanced. > >>> + // We update the file pointer that the task work is supposed to fput. > >>> + // > >>> + // SAFETY: Task works are executed on the current thread once we return to userspace, so > >>> + // this write is guaranteed to happen before `do_close_fd` is called, which means that a > >>> + // race is not possible here. > >>> + // > >>> + // It's okay to pass this pointer to the task work, since we just acquired a refcount with > >>> + // the previous call to `get_file`. Furthermore, the refcount will not drop to zero during > >>> + // an `fdget` call, since we defer the `fput` until after returning to userspace. > >>> + unsafe { *file_field = file }; > >> > >> A synchronization question: who guarantees that this write is actually > >> available to the cpu that executes `do_close_fd`? Is there some > >> synchronization run when returning to userspace? > > > > It's on the same thread, so it's just a sequenced-before relation. > > > > It's not like an interrupt. It runs after the syscall invocation has > > exited, but before it does the actual return-to-userspace stuff. > > Reasonable, can you also put this in a comment? What do you want me to add? I already say that it will be executed on the same thread. > >>> +/// Represents a failure to close an fd in a deferred manner. > >>> +#[derive(Copy, Clone, Eq, PartialEq)] > >>> +pub enum DeferredFdCloseError { > >>> + /// Closing the fd failed because we were unable to schedule a task work. > >>> + TaskWorkUnavailable, > >>> + /// Closing the fd failed because the fd does not exist. > >>> + BadFd, > >>> +} > >>> + > >>> +impl From<DeferredFdCloseError> for Error { > >>> + fn from(err: DeferredFdCloseError) -> Error { > >>> + match err { > >>> + DeferredFdCloseError::TaskWorkUnavailable => ESRCH, > >> > >> This error reads "No such process", I am not sure if that is the best > >> way to express the problem in that situation. I took a quick look at the > >> other error codes, but could not find a better fit. Do you have any > >> better ideas? Or is this the error that C binder uses? > > > > This is the error code that task_work_add returns. (It can't happen in > > Binder.) > > > > And I do think that it is a reasonable choice, because the error only > > happens if you're calling the method from a context that has no > > userspace process associated with it. > > I see. > > What do you think of making the Rust error more descriptive? So instead > of implementing `Debug` like you currently do, you print > > $error ($variant) > > where $error = Error::from(*self) and $variant is the name of the > variant? > > This is more of a general suggestion, I don't think that this error type > in particular warrants this. But in general with Rust we do have the > option to have good error messages for every error while maintaining > efficient error values. I can #[derive(Debug)] instead, I guess? Alice