Re: [PATCH v5 3/9] rust: file: add Rust abstraction for `struct file`

Christian Brauner <brauner@xxxxxxxxxx> · Sun, 31 Mar 2024 14:55:13 +0200

On Wed, Mar 20, 2024 at 06:09:05PM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > On Fri, Feb 09, 2024 at 11:18:16AM +0000, Alice Ryhl wrote:
> >> From: Wedson Almeida Filho <wedsonaf@xxxxxxxxx>
> >> 
> >> This abstraction makes it possible to manipulate the open files for a
> >> process. The new `File` struct wraps the C `struct file`. When accessing
> >> it using the smart pointer `ARef<File>`, the pointer will own a
> >> reference count to the file. When accessing it as `&File`, then the
> >> reference does not own a refcount, but the borrow checker will ensure
> >> that the reference count does not hit zero while the `&File` is live.
> >> 
> >> Since this is intended to manipulate the open files of a process, we
> >> introduce an `fget` constructor that corresponds to the C `fget`
> >> method. In future patches, it will become possible to create a new fd in
> >> a process and bind it to a `File`. Rust Binder will use these to send
> >> fds from one process to another.
> >> 
> >> We also provide a method for accessing the file's flags. Rust Binder
> >> will use this to access the flags of the Binder fd to check whether the
> >> non-blocking flag is set, which affects what the Binder ioctl does.
> >> 
> >> This introduces a struct for the EBADF error type, rather than just
> >> using the Error type directly. This has two advantages:
> >> * `File::from_fd` returns a `Result<ARef<File>, BadFdError>`, which the
> > 
> > Sorry, where's that method?
> 
> Sorry, this is supposed to say `File::fget`.
> 
> >> +/// Wraps the kernel's `struct file`.
> >> +///
> >> +/// This represents an open file rather than a file on a filesystem. Processes generally reference
> >> +/// open files using file descriptors. However, file descriptors are not the same as files. A file
> >> +/// descriptor is just an integer that corresponds to a file, and a single file may be referenced
> >> +/// by multiple file descriptors.
> >> +///
> >> +/// # Refcounting
> >> +///
> >> +/// Instances of this type are reference-counted. The reference count is incremented by the
> >> +/// `fget`/`get_file` functions and decremented by `fput`. The Rust type `ARef<File>` represents a
> >> +/// pointer that owns a reference count on the file.
> >> +///
> >> +/// Whenever a process opens a file descriptor (fd), it stores a pointer to the file in its `struct
> >> +/// files_struct`. This pointer owns a reference count to the file, ensuring the file isn't
> >> +/// prematurely deleted while the file descriptor is open. In Rust terminology, the pointers in
> >> +/// `struct files_struct` are `ARef<File>` pointers.
> >> +///
> >> +/// ## Light refcounts
> >> +///
> >> +/// Whenever a process has an fd to a file, it may use something called a "light refcount" as a
> >> +/// performance optimization. Light refcounts are acquired by calling `fdget` and released with
> >> +/// `fdput`. The idea behind light refcounts is that if the fd is not closed between the calls to
> >> +/// `fdget` and `fdput`, then the refcount cannot hit zero during that time, as the `struct
> >> +/// files_struct` holds a reference until the fd is closed. This means that it's safe to access the
> >> +/// file even if `fdget` does not increment the refcount.
> >> +///
> >> +/// The requirement that the fd is not closed during a light refcount applies globally across all
> >> +/// threads - not just on the thread using the light refcount. For this reason, light refcounts are
> >> +/// only used when the `struct files_struct` is not shared with other threads, since this ensures
> >> +/// that other unrelated threads cannot suddenly start using the fd and close it. Therefore,
> >> +/// calling `fdget` on a shared `struct files_struct` creates a normal refcount instead of a light
> >> +/// refcount.
> > 
> > When the fdget() calling task doesn't have a shared file descriptor
> > table fdget() will not increment the reference count, yes. This
> > also implies that you cannot have task A use fdget() and then pass
> > f.file to task B that holds on to it while A returns to userspace. It's
> > irrelevant that task A won't drop the reference count or that task B
> > won't drop the reference count. Because task A could return back to
> > userspace and immediately close the fd via a regular close() system call
> > at which point task B has a UAF. In other words a file that has been
> > gotten via fdget() can't be Send to another task without the Send
> > implying taking a reference to it.
> 
> That matches my understanding.
> 
> I suppose that technically you can still send it to another thread *if* you
> ensure that the current thread waits until that other thread stops using the
> file before returning to userspace.

_Technically_ yes, but it would be brittle as hell. The problem is that
fdget() _relies_ on being single-threaded for the time that fd is used
until fdput(). There's locking assumptions that build on that e.g., for
concurrent read/write. So no, that shouldn't be allowed.

Look at how this broke our back when we introduced pidfd_getfd() where
we steal an fd from another task. I have a lengthy explanation how that
can be used to violate our elided-locking which is based on assuming
that we're always single-threaded and the file can't be suddenly shared
with another task. So maybe doable but it would make the semantics even
more intricate.

> >> +///
> >> +/// Light reference counts must be released with `fdput` before the system call returns to
> >> +/// userspace. This means that if you wait until the current system call returns to userspace, then
> >> +/// all light refcounts that existed at the time have gone away.
> >> +///
> >> +/// ## Rust references
> >> +///
> >> +/// The reference type `&File` is similar to light refcounts:
> >> +///
> >> +/// * `&File` references don't own a reference count. They can only exist as long as the reference
> >> +///   count stays positive, and can only be created when there is some mechanism in place to ensure
> >> +///   this.
> >> +///
> >> +/// * The Rust borrow-checker normally ensures this by enforcing that the `ARef<File>` from which
> >> +///   a `&File` is created outlives the `&File`.
> > 
> > The section confuses me a little: Does the borrow-checker always ensure
> > that a &File stays valid or are there circumstances where it doesn't or
> > are you saying it doesn't enforce it?
> 
> The borrow-checker always ensures it.

Ok, thanks.

> 
> A &File is actually short-hand for &'a File, where 'a is some
> unspecified lifetime. We say that &'a File is annotated with 'a. The
> borrow-checker rejects any code that tries to use a reference after the
> end of the lifetime annotated on it.

Thanks for the explanation.

> 
> So as long as you annotate the reference with a sufficiently short
> lifetime, the borrow checker will prevent UAF. And outside of cases like

Sorry, but can you explain "sufficiently short lifetime"?

> from_ptr, the borrow-checker also takes care of ensuring that the
> lifetimes are sufficiently short.
> 
> (Technically &'a File and &'b File are two completely different types,
> so &File is technically a class of types and not a single type. Rust
> will automatically convert &'long File to &'short File.)
> 
> >> +///
> >> +/// * Using the unsafe [`File::from_ptr`] means that it is up to the caller to ensure that the
> >> +///   `&File` only exists while the reference count is positive.
> > 
> > What is this used for in binder? If it's not used don't add it.
> 
> This is used on the boundary between the Rust part of Binder and the
> binderfs component that is implemented in C. For example:

I see, I'm being foiled by my own code...

> 
> 	unsafe extern "C" fn rust_binder_open(
> 	    inode: *mut bindings::inode,
> 	    file_ptr: *mut bindings::file,
> 	) -> core::ffi::c_int {
> 	    // SAFETY: The `rust_binderfs.c` file ensures that `i_private` is set to the return value of a
> 	    // successful call to `rust_binder_new_device`.
> 	    let ctx = unsafe { Arc::<Context>::borrow((*inode).i_private) };
> 	
> 	    // SAFETY: The caller provides a valid file pointer to a new `struct file`.
> 	    let file = unsafe { File::from_ptr(file_ptr) };

We need a better name for this helper than from_ptr() imho. I think
from_ptr() and as_ptr() is odd for C. How weird would it be to call
that from_raw_file() and as_raw_file()?

But bigger picture I somewhat struggle with the semantics of this
because this is not an interface that we have in C and this is really
about a low-level contract between C and Rust. Specifically this states
that this pointer is _somehow_ guaranteed valid. And really, this seems
a bit of a hack.

Naively, I think this should probably not be necessary if
file_operations are properly wrapped. Or it should at least be demotable
to a purely internal method that can't be called directly or something.

So what I mean is. fdget() may or may not take a reference. The C
interface internally knows whether a reference is owned or not by
abusing the lower two bits in a pointer to keep track of that. Naively,
I would expect the same information to be available to rust so that it's
clear to Rust wheter it's dealing with an explicitly referenced file or
an elided-reference file. Maybe that's not possible and I'm not
well-versed enough to see that yet.

> 	    let process = match Process::open(ctx, file) {
> 	        Ok(process) => process,
> 	        Err(err) => return err.to_errno(),
> 	    };
> 	    // SAFETY: This file is associated with Rust binder, so we own the `private_data` field.
> 	    unsafe {
> 	        (*file_ptr).private_data = process.into_foreign().cast_mut();
> 	    }
> 	    0
> 	}
> 
> Here, rust_binder_open is the open function in a struct file_operations
> vtable. In this case, file_ptr is guaranteed by the caller to be valid

Where's the code that wraps struct file_operations?

> for the duration of the call to rust_binder_open. Binder uses from_ptr
> to get a &File from the raw pointer.
> 
> As far as I understand, the caller of rust_binder_open uses fdget to
> ensure that file_ptr is valid for the duration of the call, but the Rust
> code doesn't assume that it does this with fdget. As long as file_ptr is
> valid for the duration of the rust_binder_open call, this use of
> from_ptr is okay. It will continue to work even if the caller is changed
> to use fget.

Ok.

> 
> As for how this code ensures that `file` ends up annotated with a
> sufficiently short lifetime, well, that has to do with the signature of
> Process::open. Here it is:
> 
> 	impl Process {
> 	    pub(crate) fn open(ctx: ArcBorrow<'_, Context>, file: &File) -> Result<Arc<Process>> {
> 	        Self::new(ctx.into(), ARef::from(file.cred()))
> 	    }
> 	}
> 
> In this case, &File is used without specifying a lifetime. It's a
> function argument, so this means that the lifetime annotated on the
> `file` argument will be exactly the duration in which Process::open is
> called. So any attempt to use `file` after the end of the call to
> Process::open will be rejected by the borrow-checker. (E.g., if
> Process::open tried to schedule something on the workqueue using `file`,
> then that would not compile. Storing it in a global variable would not
> compile either.)
> 
> This means that the borrow-checker will not catch mistakes in
> rust_binder_open, but it *will* catch mistakes in Process::open, and
> anything called by Process::open.
> 
> These examples come from patch 2 of the Binder RFC:
> https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-2-08ba9197f637@xxxxxxxxxx/
> 
> >> +///
> >> +/// * You can think of `fdget` as using an fd to look up an `ARef<File>` in the `struct
> > 
> > Could you explain why there isn't an explicit fdget() then and you have
> > that from_ptr() method?
> 
> I don't provide an fdget implementation because Binder never calls it.
> However, if you would like, I would be happy to add one to the patchset.
> 
> As for from_ptr, see above.
> 
> Alice