Re: [PATCH RFC v2 2/3] io_uring: add IOURING_REGISTER_RESTRICTIONS opcode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 22, 2020 at 12:35:15PM +1000, Daurnimator wrote:
> On Wed, 22 Jul 2020 at 03:11, Jens Axboe <axboe@xxxxxxxxx> wrote:
> >
> > On 7/21/20 4:40 AM, Stefano Garzarella wrote:
> > > On Thu, Jul 16, 2020 at 03:26:51PM -0600, Jens Axboe wrote:
> > >> On 7/16/20 6:48 AM, Stefano Garzarella wrote:
> > >>> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> > >>> index efc50bd0af34..0774d5382c65 100644
> > >>> --- a/include/uapi/linux/io_uring.h
> > >>> +++ b/include/uapi/linux/io_uring.h
> > >>> @@ -265,6 +265,7 @@ enum {
> > >>>     IORING_REGISTER_PROBE,
> > >>>     IORING_REGISTER_PERSONALITY,
> > >>>     IORING_UNREGISTER_PERSONALITY,
> > >>> +   IORING_REGISTER_RESTRICTIONS,
> > >>>
> > >>>     /* this goes last */
> > >>>     IORING_REGISTER_LAST
> > >>> @@ -293,4 +294,30 @@ struct io_uring_probe {
> > >>>     struct io_uring_probe_op ops[0];
> > >>>  };
> > >>>
> > >>> +struct io_uring_restriction {
> > >>> +   __u16 opcode;
> > >>> +   union {
> > >>> +           __u8 register_op; /* IORING_RESTRICTION_REGISTER_OP */
> > >>> +           __u8 sqe_op;      /* IORING_RESTRICTION_SQE_OP */
> > >>> +   };
> > >>> +   __u8 resv;
> > >>> +   __u32 resv2[3];
> > >>> +};
> > >>> +
> > >>> +/*
> > >>> + * io_uring_restriction->opcode values
> > >>> + */
> > >>> +enum {
> > >>> +   /* Allow an io_uring_register(2) opcode */
> > >>> +   IORING_RESTRICTION_REGISTER_OP,
> > >>> +
> > >>> +   /* Allow an sqe opcode */
> > >>> +   IORING_RESTRICTION_SQE_OP,
> > >>> +
> > >>> +   /* Only allow fixed files */
> > >>> +   IORING_RESTRICTION_FIXED_FILES_ONLY,
> > >>> +
> > >>> +   IORING_RESTRICTION_LAST
> > >>> +};
> > >>> +
> > >>
> > >> Not sure I totally love this API. Maybe it'd be cleaner to have separate
> > >> ops for this, instead of muxing it like this. One for registering op
> > >> code restrictions, and one for disallowing other parts (like fixed
> > >> files, etc).
> > >>
> > >> I think that would look a lot cleaner than the above.
> > >>
> > >
> > > Talking with Stefan, an alternative, maybe more near to your suggestion,
> > > would be to remove the 'struct io_uring_restriction' and add the
> > > following register ops:
> > >
> > >     /* Allow an sqe opcode */
> > >     IORING_REGISTER_RESTRICTION_SQE_OP
> > >
> > >     /* Allow an io_uring_register(2) opcode */
> > >     IORING_REGISTER_RESTRICTION_REG_OP
> > >
> > >     /* Register IORING_RESTRICTION_*  */
> > >     IORING_REGISTER_RESTRICTION_OP
> > >
> > >
> > >     enum {
> > >         /* Only allow fixed files */
> > >         IORING_RESTRICTION_FIXED_FILES_ONLY,
> > >
> > >         IORING_RESTRICTION_LAST
> > >     }
> > >
> > >
> > > We can also enable restriction only when the rings started, to avoid to
> > > register IORING_REGISTER_ENABLE_RINGS opcode. Once rings are started,
> > > the restrictions cannot be changed or disabled.
> >
> > My concerns are largely:
> >
> > 1) An API that's straight forward to use
> > 2) Something that'll work with future changes
> >
> > The "allow these opcodes" is straightforward, and ditto for the register
> > opcodes. The fixed file I guess is the odd one out. So if we need to
> > disallow things in the future, we'll need to add a new restriction
> > sub-op. Should this perhaps be "these flags must be set", and that could
> > easily be augmented with "these flags must not be set"?
> >
> > --
> > Jens Axboe
> >
> 
> This is starting to sound a lot like seccomp filtering.
> Perhaps we should go straight to adding a BPF hook that fires when
> reading off the submission queue?
> 

You're right. I e-mailed about that whit Kees Cook [1] and he agreed that the
restrictions in io_uring should allow us to address some issues that with
seccomp it's a bit difficult. For example:
- different restrictions for different io_uring instances in the same
  process
- limit SQEs to use only registered fds and buffers

Maybe seccomp could take advantage of the restrictions to filter SQEs opcodes.

Thanks,
Stefano

[1] https://lore.kernel.org/io-uring/202007160751.ED56C55@keescook/




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux