Hi,
This looks interesting! I have some questions:

1. What is the ubdsrv permission model?

A big usability challenge for *-in-userspace interfaces is the balance
between security and allowing unprivileged processes to use these
features.

- Does /dev/ubd-control need to be privileged? I guess the answer is
  yes since an evil ubdsrv can hang I/O and corrupt data in hopes of
  triggering file system bugs.
- Can multiple processes that don't trust each other use UBD at the same
  time? I guess not since ubd_index_idr is global.
- What about containers and namespaces? They currently have (write)
  access to the same global ubd_index_idr.
- Maybe there should be a struct ubd_device "owner" (struct task_struct
  *) so only devices created by the current process can be modified?

2. io_uring_cmd design

The rationale for the io_uring_cmd design is not explained in the cover
letter. I think it's worth explaining the design. Here are my guesses:

The same thing can be achieved with just file_operations and io_uring.
ubdsrv could read I/O submissions with IORING_OP_READ and write I/O
completions with IORING_OP_WRITE. That would require 2 sqes per
roundtrip instead of 1, but the same number of io_uring_enter(2) calls
since multiple sqes/cqes can be batched per syscall:

- IORING_OP_READ, addr=(struct ubdsrv_io_desc*) (for submission)
- IORING_OP_WRITE, addr=(struct ubdsrv_io_cmd*) (for completion)

Both operations require a copy_to/from_user() to access the command
metadata.

The io_uring_cmd approach works differently. The IORING_OP_URING_CMD sqe
carries a 40-byte payload so it's possible to embed struct ubdsrv_io_cmd
inside it. The struct ubdsrv_io_desc mmap gets around the fact that
io_uring cqes contain no payload. The driver therefore needs a
side-channel to transfer the request submission details to ubdsrv. I
don't see much of a difference between IORING_OP_READ and the mmap
approach though.

It's not obvious to me how much more efficient the io_uring_cmd approach
is, but taking fewer trips around the io_uring submission/completion
code path is likely to be faster.

Something similar can be done with file_operations ->ioctl(), but I
guess the point of using io_uring is that it composes. If ubdsrv itself
wants to use io_uring for other I/O activity (e.g. networking, disk I/O,
etc) then it can do so and won't be stuck in a blocking ioctl() syscall.

It would be nice if you could write 2 or 3 paragraphs explaining why the
io_uring_cmd design and the struct ubdsrv_io_desc mmap were chosen.

3. Miscellaneous stuff

- There isn't much in the way of memory ordering in the code. I worry a
  little that changes to the struct ubdsrv_io_desc mmap may not be
  visible at the expected time with respect to the io_uring cq ring.

Thanks,
Stefan
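
P.S. To make the memory ordering concern in point 3 concrete, here is a
minimal userspace sketch of the pattern I have in mind. It is not taken
from the patch: struct demo_io_desc, demo_descs, demo_tail,
demo_publish() and demo_consume() are all hypothetical names, and C11
atomics stand in for whatever primitives the driver and ubdsrv would
really use. The idea is only that the descriptor is written first and
published with a release store, and that the reader does an acquire load
of the tail before touching the descriptor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_QUEUE_DEPTH 8

/* Hypothetical stand-in for one slot of the mmap'ed descriptor area. */
struct demo_io_desc {
    uint32_t op;
    uint32_t nr_sectors;
    uint64_t start_sector;
};

static struct demo_io_desc demo_descs[DEMO_QUEUE_DEPTH]; /* plain memory */
static _Atomic uint32_t demo_tail; /* stand-in for the cq ring tail */

/* Producer ("driver") side: fill in the descriptor, then publish it. */
void demo_publish(uint32_t op, uint32_t nr_sectors, uint64_t start_sector)
{
    uint32_t tail = atomic_load_explicit(&demo_tail, memory_order_relaxed);

    assert(tail < DEMO_QUEUE_DEPTH); /* the sketch does not wrap around */
    demo_descs[tail].op = op;
    demo_descs[tail].nr_sectors = nr_sectors;
    demo_descs[tail].start_sector = start_sector;

    /* Release: a reader that observes the new tail with an acquire load
     * also observes the descriptor stores above. */
    atomic_store_explicit(&demo_tail, tail + 1, memory_order_release);
}

/* Consumer ("ubdsrv") side: copy out slot `head` if it has been published.
 * Returns 1 on success, 0 if the slot is not visible yet. */
int demo_consume(uint32_t head, struct demo_io_desc *out)
{
    /* Acquire: pairs with the release store in demo_publish(). */
    uint32_t tail = atomic_load_explicit(&demo_tail, memory_order_acquire);

    if (head >= tail)
        return 0;
    *out = demo_descs[head];
    return 1;
}
```

As far as I understand, the io_uring cq tail itself is already published
and read with release/acquire semantics, so the question for the patch
is mainly whether the struct ubdsrv_io_desc write is ordered before the
cqe is posted. If it is, a comment in the code saying so would be enough.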