Re: Making AppArmor work with new mount context API

John Johansen <john.johansen@xxxxxxxxxxxxx> · Wed, 10 Jan 2018 04:05:10 -0800

On 01/09/2018 08:37 AM, David Howells wrote:
> Hi John,
> 
> I've been having a look at making AppArmor work with the new mount API, the
> basic infrastructure for which can be found here:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=mount-context
> 
> but this doesn't work for AppArmor.  Unfortunately, I've come across a tricky
> problem that I'm not sure how to solve, but maybe you can help with.
> 
> The issue involves how apparmor evaluates a DFA in do_match_mnt().  The DFA is
> fed the following things in the order:
> 
>  (1) Mountpoint path.
> 
>  (2) Device name.
> 
>  (3) File system type.
> 
>  (4) MS_* flags.
> 
>  (5) Non-binary mount data string (ie. mount options).
> 

yep

> However, what I want to do is make it so that:
> 
>  (1) MNT_* flags are differentiated from SB_* flags (ie. MS_* flags don't get
>      past the syscall interface).
> 
makes sense

>  (2) The superblock is set up as a separate step from the mounting step, prior
>      to the mounting step and then the mounting step may be repeated multiple
>      times for a given context, eg.:
> 
> 	fd = fsopen("ext4");
> 	write(fd, "d /dev/sda2");
> 	write(fd, "o user_xattr");
> 	write(fd, "o acl");
> 	write(fd, "o data=ordered");
> 	write(fd, "x create");
> 	// We now have a superblock and we can now query fs data
> 	write(fd, "q block_size");
> 	read(fd, query_buffer);
> 	// Mount three times
> 	fsmount(fd, "/mnt/a", MNT_NODEV | MNT_NOEXEC);
> 	fsmount(fd, "/mnt/b", 0);
> 	fsmount(fd, "/mnt/c", 0);
> 	close(fd);
> 
okay

>  (3) We can pick an existing object and then create a bind mount by something
>      like:
> 
> 	fd = fspick("/mnt/a");
> 	fsmount(fd, "/mnt/d");
> 	close(fd);
> 
>  (4) We can reconfigure (~= remount) a superblock by picking it, setting
>      parameters and then executing the reconfigure, eg.:
> 
> 	fd = fspick("/mnt/a");
> 	write(fd, "o ro");
> 	write(fd, "x reconfigure");
> 	close(fd);
> 
>      (Note that I want to use this opportunity to make reconfiguration atomic
>       with all the parameter error checking being done first).
> 
>  (5) Any particular option value may exceed a page in size and the total set
>      of options may exceed a page in size.  We can do this because the options
>      are delivered with write() and then parsed immediately.
> 
>  (6) Any particular option could be binary rather than text.  write() is given
>      the size of the blob, so we don't need to guess.  This feature could be
>      used to pass authentication data, say, for a network fs.
> 
> So the order in which Apparmor evaluates its DFA is really inconvenient.
> Ideally it would be:
> 
>  (1) Filesystem type.
> 
>  (2) Device name.
> 
>  (3) SB_* flags.
> 
>  (4) Options, passed one at a time (it doesn't look like it would be a problem
>      to save the intermediate DFA state).
> 

right, there are a couple places where we store and restart based off
of a saved state
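
To illustrate the save-and-restart idea, here is a minimal userspace
sketch. It is not AppArmor code: the toy_* names, the '|' separator and
the two literal rules are all made up for the example, and the fields
are fed in the superblock-first order proposed above rather than the
order do_match_mnt() uses today. The point is only that the state
reached after the fs type and device can be kept and reused for each
mount path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a policy DFA: a set of literal rules walked in
 * lockstep.  The state is "which rules are still alive" plus the
 * number of input bytes consumed.  All names here are hypothetical. */
struct toy_state {
	unsigned int alive;	/* bitmask of rules that still match */
	size_t pos;		/* bytes consumed so far */
};

#define NRULES 2
static const char *const toy_rules[NRULES] = {
	"ext4|/dev/sda2|/mnt/a",
	"ext4|/dev/sda2|/mnt/b",
};

static struct toy_state toy_start(void)
{
	struct toy_state s = { (1u << NRULES) - 1, 0 };
	return s;
}

/* Feed one chunk of input and return the resulting state.  The state
 * is passed and returned by value, so a caller can keep a copy and
 * resume from it as often as it likes. */
static struct toy_state toy_feed(struct toy_state s, const char *str)
{
	assert(str != NULL);
	size_t len = strlen(str);

	for (unsigned int i = 0; i < NRULES; i++) {
		const char *r = toy_rules[i];

		if (!(s.alive & (1u << i)))
			continue;
		if (strlen(r) < s.pos + len ||
		    memcmp(r + s.pos, str, len) != 0)
			s.alive &= ~(1u << i);
	}
	s.pos += len;
	return s;
}

/* Accept if some rule is still alive and fully consumed. */
static int toy_accepts(struct toy_state s)
{
	for (unsigned int i = 0; i < NRULES; i++)
		if ((s.alive & (1u << i)) && strlen(toy_rules[i]) == s.pos)
			return 1;
	return 0;
}
```

A caller would compute the state once after "ext4|" and "/dev/sda2|",
stash it, and then call toy_feed() on that saved copy for each mount
path.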

> save this DFA state, then for each mount attempt, begin with the saved state
> and add:
> 
>  (5) Mount path.
> 
>  (6) MNT_* flags.
> 
> I don't suppose the DFA is created such that it doesn't matter what the order
> is?

it isn't atm, but the ability to reorder and change the parameter set
is planned, in order to provide extended conditional support. So once
that lands a reordering would be possible.

Basically instead of hard coding the match order, policy will get a
vector that will be used to determine the set of conditionals and the
order to match them in.
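
A rough sketch of what such an ordering vector could look like, purely
as an assumption for discussion: the enum, the struct and
toy_build_input() are hypothetical names, flags are left out to keep
everything a string, and the '|' separator is arbitrary. It only shows
the same request being serialised in two different orders from a
vector instead of from hard-coded calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical set of conditionals; flags omitted for brevity. */
enum toy_cond {
	COND_MNTPOINT,
	COND_DEVNAME,
	COND_FSTYPE,
	COND_OPTIONS,
	COND_MAX
};

struct toy_mount_req {
	const char *field[COND_MAX];
};

/* Serialise the request fields into 'out' in the order given by the
 * vector, separated by '|'.  Returns 0 on success, -1 if 'out' is too
 * small or the vector names an unknown conditional. */
static int toy_build_input(const struct toy_mount_req *req,
			   const enum toy_cond *order, size_t n,
			   char *out, size_t outsz)
{
	size_t used = 0;

	assert(req != NULL && order != NULL && out != NULL);
	if (outsz == 0)
		return -1;
	out[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		if ((unsigned int)order[i] >= COND_MAX)
			return -1;
		const char *val = req->field[order[i]];
		int w = snprintf(out + used, outsz - used, "%s%s",
				 i ? "|" : "", val ? val : "");

		if (w < 0 || (size_t)w >= outsz - used)
			return -1;	/* error or truncated */
		used += (size_t)w;
	}
	return 0;
}
```

With a vector of { COND_FSTYPE, COND_DEVNAME, COND_OPTIONS,
COND_MNTPOINT } the mount path comes last, which is what the
superblock-first flow wants.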

> I would really like to avoid having to buffer the entire option set as a
> text string just so that Apparmor can parse it for each mount (though this
> could be done inside Apparmor code as the fs_context struct gives the active
> LSM somewhere to save state).
> 

that would be ugly but acceptable as an intermediate solution until I
can land the extended conditionals support.
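
For the buffering route, something along these lines is what I would
picture, as a userspace sketch only. The toy_opt_* names are
hypothetical; in the kernel the buffer would presumably live in the
LSM state hanging off the fs_context and use kernel allocators. It
also assumes text options joined with ',', so binary blobs as in (6)
would need separate handling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-context buffer that accumulates text options as
 * they arrive, so the whole set can be replayed for each mount. */
struct toy_opt_buf {
	char *data;	/* NUL-terminated, comma-separated options */
	size_t len;	/* length of data, excluding the NUL */
};

/* Append one option.  Returns 0 on success, -1 on allocation
 * failure, in which case the buffer is left unchanged. */
static int toy_opt_add(struct toy_opt_buf *b, const char *opt)
{
	assert(b != NULL && opt != NULL);

	size_t olen = strlen(opt);
	size_t sep = b->len ? 1 : 0;
	char *p = realloc(b->data, b->len + sep + olen + 1);

	if (!p)
		return -1;
	if (sep)
		p[b->len] = ',';
	memcpy(p + b->len + sep, opt, olen + 1);	/* copies the NUL */
	b->data = p;
	b->len += sep + olen;
	return 0;
}

static void toy_opt_free(struct toy_opt_buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = 0;
}
```

Each "o ..." write would call toy_opt_add(), and each mount attempt
would hand b->data to the existing matcher as the mount data string.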

I am not sure which is the quickest approach to get apparmor out of
the way for the new mount API. I can accelerate the extended
conditional work some by making mount mediation the first to use it,
but my guess is buffering the option set will still be a little
quicker as the extended conditional rework will require both kernel
and policy compiler work.