Re: [PATCH] files: rcu free files_struct

Christian Brauner <christian.brauner@xxxxxxxxxx> · Thu, 10 Dec 2020 23:30:24 +0100

On Thu, Dec 10, 2020 at 09:36:24PM +0000, Al Viro wrote:
> On Thu, Dec 10, 2020 at 01:29:01PM -0600, Eric W. Biederman wrote:
> > Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
> 
> > > What are the users of that thing and is there any chance to replace it
> > > with something saner?  IOW, what *is* realistically called for each
> > > struct file by the users of that iterator?
> > 
> > The bpf guys are no longer Cc'd and they can probably answer better than
> > I.
> > 
> > In a previous conversation it was mentioned that task_iter was supposed
> > to be a high performance interface for getting proc like data out of the
> > kernel using bpf.
> > 
> > If so I think that handles the lifetime issues as bpf programs are
> > supposed to be short-lived and can not pass references anywhere.
> > 
> > On the flip side it raises the question did the BPF guys just make the
> > current layout of task_struct and struct file part of the linux kernel
> > user space ABI?
> 
> An interesting question, that...  For the record: anybody coming to

Imho, they did. An example from the BPF LSM: a few weeks ago someone
asked me whether it would be possible to use the BPF LSM to enforce that
you can't open files on a given filesystem. Since the BPF LSM allows
attaching to LSM hooks, say security_file_open(), you can get at the
superblock and check the filesystem type in a bpf program (requiring
BTF), i.e. attach to security_file_open, then follow
file->f_inode->i_sb->s_magic. If we change, say, struct super_block I'd
expect these bpf programs to break. I'm sure there's something clever
that they came up with but it is nonetheless uncomfortably close to
making internal kernel structures part of userspace ABI indeed.
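
To make the dependency concrete, here is a rough user-space mock-up of
that pointer chase. It is only a sketch: the mock_* structs, the magic
value and the hook function are made up for illustration and are not
the kernel's definitions or a real BPF program. The point is just that
the check names three fields directly, so renaming or moving any of
them breaks it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins, NOT the real struct file/inode/super_block. */
struct mock_super_block { unsigned long s_magic; };
struct mock_inode { struct mock_super_block *i_sb; };
struct mock_file { struct mock_inode *f_inode; };

/* Made-up magic number for the filesystem we want to refuse. */
#define MOCK_DENIED_MAGIC 0x1234abcdUL

/*
 * What a program attached to the file_open hook would do in spirit:
 * walk file -> inode -> superblock and compare the magic.
 * Returns 0 to allow the open, -EPERM to deny it.
 */
static int mock_file_open_hook(const struct mock_file *file)
{
	if (file->f_inode->i_sb->s_magic == MOCK_DENIED_MAGIC)
		return -EPERM;
	return 0;
}

/* Small demo: one file on the denied filesystem, one elsewhere. */
static void mock_demo(void)
{
	struct mock_super_block denied = { .s_magic = MOCK_DENIED_MAGIC };
	struct mock_super_block other = { .s_magic = 0x1UL };
	struct mock_inode ino_denied = { .i_sb = &denied };
	struct mock_inode ino_other = { .i_sb = &other };
	struct mock_file f_denied = { .f_inode = &ino_denied };
	struct mock_file f_other = { .f_inode = &ino_other };

	assert(mock_file_open_hook(&f_denied) == -EPERM);
	assert(mock_file_open_hook(&f_other) == 0);
}
```

Nothing in the check goes through an accessor, so every field name on
that path is effectively something the program relies on.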

> complain about a removed/renamed/replaced with something else field
> in struct file will be referred to Figure 1.
> 
> None of the VFS data structures has any layout stability warranties.
> If BPF folks want access to something in that, they are welcome to come
> and discuss the set of accessors; so far nothing of that sort has happened.
> 
> Direct access to any fields of any of those structures is subject to
> being broken at zero notice.
> 
> IMO we need some notation for a structure being off-limits for BPF, tracing,
> etc., along the lines of "don't access any field directly".

Indeed. I would also like to see a list to which changes must be sent
when they are technically specific to one subsystem but will necessarily
have kernel-wide impact. Prime example: a lot of bpf.

Christian