On 11/22/19 10:09 AM, Stephen Smalley wrote:
On 11/22/19 9:49 AM, Paul Moore wrote:
On Fri, Nov 22, 2019 at 8:37 AM Stephen Smalley <sds@xxxxxxxxxxxxx>
wrote:
On 11/21/19 7:30 PM, Paul Moore wrote:
On Thu, Nov 21, 2019 at 7:12 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@xxxxxxxxxxxxx>
wrote:
commit bda0be7ad994 ("security: make inode_follow_link RCU-walk
aware")
passed down the rcu flag to the SELinux AVC, but failed to adjust the
test in slow_avc_audit() to also return -ECHILD on
LSM_AUDIT_DATA_DENTRY.
Previously, we only returned -ECHILD if generating an audit record
with
LSM_AUDIT_DATA_INODE since this was only relevant from
inode_permission.
Return -ECHILD on either LSM_AUDIT_DATA_INODE or
LSM_AUDIT_DATA_DENTRY.
LSM_AUDIT_DATA_INODE only requires this handling due to the fact
that dump_common_audit_data() calls d_find_alias() and collects the
dname from the result if any.
Other cases that might require similar treatment in the future are
LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes
a path or file is called under RCU-walk.
Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk
aware")
Signed-off-by: Stephen Smalley <sds@xxxxxxxxxxxxx>
---
security/selinux/avc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 74c43ebe34bb..f1fa1072230c 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct
selinux_state *state,
* during retry. However this is logically just as if
the operation
* happened a little later.
*/
- if ((a->type == LSM_AUDIT_DATA_INODE) &&
+ if ((a->type == LSM_AUDIT_DATA_INODE ||
+ a->type == LSM_AUDIT_DATA_DENTRY) &&
(flags & MAY_NOT_BLOCK))
return -ECHILD;
With LSM_AUDIT_DATA_INODE we eventually end up calling d_find_alias()
in dump_common_audit_data() which could block, which is bad, that I
understand. However, looking at LSM_AUDIT_DATA_DENTRY I'm less clear
on why that is bad? It makes a few audit_log*() calls and one call to
d_backing_inode() which is non-blocking and trivial.
What am I missing?
For those who haven't, you may wish to also read the earlier thread here
that led to this one:
https://lore.kernel.org/selinux/20191119184057.14961-1-will@xxxxxxxxxx/T/#t
AFAIK, neither the LSM_AUDIT_DATA_INODE nor the LSM_AUDIT_DATA_DENTRY
case truly block (d_find_alias does not block AFAICT, nor should
audit_log* as long as we use audit_log_start with GFP_ATOMIC or
GFP_NOWAIT).
Yes, the audit_log*() functions should be safe, if not I would
consider that a bug; I thought d_find_alias() might block, but it's
very likely I'm wrong in that regard.
No, it doesn't appear to block. However, it does take d_lock and
increment d_lockref.count, which IIUC aren't permitted during rcu-walk.
My impression from the comment in slow_avc_audit() is that
the issue is not really about blocking but rather about the inability to
safely dereference the dentry->d_name during RCU walk, which is
something that can occur under LSM_AUDIT_DATA_INODE or _DENTRY (or _PATH
or _FILE, but neither of the latter two are currently used from the two
hooks that are called during RCU walk, inode_permission and
inode_follow_link). Originally _PATH, _DENTRY, and _INODE were all
under a single _FS type and the original test in slow_avc_audit() was
against LSM_AUDIT_DATA_FS before the split.
Thanks, I think that is the part I was missing. I was focused too
much on the VFS stuff that I didn't pay enough attention to
slow_avc_audit().
If that is the case, the comment and code in dentry_cmp() would seem
to indicate that it would be safe to fetch the dentry name string as
long as we use READ_ONCE(). The length field in the qstr might be
off, but the audit_log_untrustedstring() function doesn't use the
qstr's length information. I suppose if we don't mind the extra
spinlock we could use take_dentry_name_snapshot(); that should be safe
and we are already in the "slow" path. I didn't check the _PATH or
_FILE cases.
Once again, let me know if I'm missing something.
We can't take any spinlocks on the dentry during rcu-walk IIUC; that
would defeat the purpose. In looking for a parallel with filesystem
implementations, I noted that fs/namei.c:get_link() doesn't even pass
the dentry to the filesystem get_link() method in the rcu-walk case,
only doing so under ref-walk. So they won't permit the filesystem
implementations to ever dereference the dentry for get_link() under
rcu-walk. Not sure why it gets passed to security_inode_follow_link()
then, or if it is ever safe for a security module to dereference its
fields.
I was hoping to get fsdevel folks to comment since I feel like we're
guessing about exactly what guarantees we have in this area.
As an aside, if we somehow can guarantee (e.g. via a name_snapshot)
that qstr length information is valid, we might want to consider
moving from audit_log_unstrustedstring() to
audit_log_n_untrustedstring() to save us a call to strlen().
Added the LSM list as I'm beginning to wonder if we should push this
logic down into common_lsm_audit(), this problem around blocking
shouldn't be SELinux specific.
That would require passing down the MAY_NOT_BLOCK flag or a rcu bool to
common_lsm_audit() just so that it could immediately return if that is
set and a->type is _INODE or _DENTRY (or _PATH or _FILE). And the
individual security module still needs to have its own handling of
MAY_NOT_BLOCK/rcu for its own processing, so it won't free the security
module authors from thinking about it.
Looking at the current SELinux code, all we do is bail out early with
-ECHILD. If we didn't have that check it looks like the only impact
would be some extra assignments into a struct living on the stack and
a call into common_lsm_audit(). That doesn't seem terrible for a slow
path, especially if it pushes this code into a LSM common function.
Not terrible, just not sure if it ends up being a worthwhile change. If
the LSM maintainers would like it that way, I can do that.
I think this rendered moot by viro's suggestion, since we are taking the
handling of MAY_NOT_BLOCK up even earlier in the processing and the
flags don't need to be passed down to slow_avc_audit() anymore. Sure,
we could still pass them down and defer the handling to
common_lsm_audit(), but that's just extra wasted work before we bail
out, and we are no longer even testing the a->type field with the new
logic so there is no longer anything related to the lsm_audit
implementation.
For the LSM folks just joining, the full patchset can be found here:
*
https://lore.kernel.org/selinux/20191121145245.8637-1-sds@xxxxxxxxxxxxx/T/#t