Re: [PATCH ghak90 V8 02/16] audit: add container id

Richard Guy Briggs <rgb@xxxxxxxxxx> · Thu, 30 Jan 2020 12:53:46 -0500

On 2020-01-22 16:28, Paul Moore wrote:
> On Tue, Dec 31, 2019 at 2:49 PM Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> >
> > Implement the proc fs write to set the audit container identifier of a
> > process, emitting an AUDIT_CONTAINER_OP record to document the event.
> >
> > This is a write from the container orchestrator task to a proc entry of
> > the form /proc/PID/audit_containerid where PID is the process ID of the
> > newly created task that is to become the first task in a container, or
> > an additional task added to a container.
> >
> > The write expects up to a u64 value (unset: 18446744073709551615).
> >
> > The writer must have capability CAP_AUDIT_CONTROL.
> >
> > This will produce a record such as this:
> >   type=CONTAINER_OP msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 contid=123456 old-contid=18446744073709551615
> >
> > The "op" field indicates an initial set.  The "opid" field is the
> > object's PID, the process being "contained".  New and old audit
> > container identifier values are given in the "contid" fields.
> >
> > It is not permitted to unset the audit container identifier.
> > A child inherits its parent's audit container identifier.
> >
> > Please see the github audit kernel issue for the main feature:
> >   https://github.com/linux-audit/audit-kernel/issues/90
> > Please see the github audit userspace issue for supporting additions:
> >   https://github.com/linux-audit/audit-userspace/issues/51
> > Please see the github audit testsuiite issue for the test case:
> >   https://github.com/linux-audit/audit-testsuite/issues/64
> > Please see the github audit wiki for the feature overview:
> >   https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> >
> > Signed-off-by: Richard Guy Briggs <rgb@xxxxxxxxxx>
> > Acked-by: Serge Hallyn <serge@xxxxxxxxxx>
> > Acked-by: Steve Grubb <sgrubb@xxxxxxxxxx>
> > Acked-by: Neil Horman <nhorman@xxxxxxxxxxxxx>
> > Reviewed-by: Ondrej Mosnacek <omosnace@xxxxxxxxxx>
> > Signed-off-by: Richard Guy Briggs <rgb@xxxxxxxxxx>
> > ---
> >  fs/proc/base.c             | 36 ++++++++++++++++++++++++++++
> >  include/linux/audit.h      | 25 ++++++++++++++++++++
> >  include/uapi/linux/audit.h |  2 ++
> >  kernel/audit.c             | 58 ++++++++++++++++++++++++++++++++++++++++++++++
> >  kernel/audit.h             |  1 +
> >  kernel/auditsc.c           |  4 ++++
> >  6 files changed, 126 insertions(+)
> 
> ...
> 
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index 397f8fb4836a..2d7707426b7d 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -2356,6 +2358,62 @@ int audit_signal_info(int sig, struct task_struct *t)
> >         return audit_signal_info_syscall(t);
> >  }
> >
> > +/*
> > + * audit_set_contid - set current task's audit contid
> > + * @task: target task
> > + * @contid: contid value
> > + *
> > + * Returns 0 on success, -EPERM on permission failure.
> > + *
> > + * Called (set) from fs/proc/base.c::proc_contid_write().
> > + */
> > +int audit_set_contid(struct task_struct *task, u64 contid)
> > +{
> > +       u64 oldcontid;
> > +       int rc = 0;
> > +       struct audit_buffer *ab;
> > +
> > +       task_lock(task);
> > +       /* Can't set if audit disabled */
> > +       if (!task->audit) {
> > +               task_unlock(task);
> > +               return -ENOPROTOOPT;
> > +       }
> > +       oldcontid = audit_get_contid(task);
> > +       read_lock(&tasklist_lock);
> > +       /* Don't allow the audit containerid to be unset */
> > +       if (!audit_contid_valid(contid))
> > +               rc = -EINVAL;
> > +       /* if we don't have caps, reject */
> > +       else if (!capable(CAP_AUDIT_CONTROL))
> > +               rc = -EPERM;
> > +       /* if task has children or is not single-threaded, deny */
> > +       else if (!list_empty(&task->children))
> > +               rc = -EBUSY;
> > +       else if (!(thread_group_leader(task) && thread_group_empty(task)))
> > +               rc = -EALREADY;
> 
> [NOTE: there is a bigger issue below which I think is going to require
> a respin/fixup of this patch so I'm going to take the opportunity to
> do a bit more bikeshedding ;)]
> 
> It seems like we could combine both the thread/children checks under a
> single -EBUSY return value.  In both cases the caller should be able
> to determine if the target process is multi-threaded for has spawned
> children, yes?  FWIW, my motivation for this question is that
> -EALREADY seems like a poor choice here.

Fair enough.

> > +       /* if contid is already set, deny */
> > +       else if (audit_contid_set(task))
> > +               rc = -ECHILD;
> 
> Does -EEXIST make more sense here?

Perhaps.  I don't feel strongly about it, but none of these error codes
were intended for this use and should not overlap with other errors from
writing to /proc.

> > +       read_unlock(&tasklist_lock);
> > +       if (!rc)
> > +               task->audit->contid = contid;
> > +       task_unlock(task);
> > +
> > +       if (!audit_enabled)
> > +               return rc;
> > +
> > +       ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_OP);
> > +       if (!ab)
> > +               return rc;
> > +
> > +       audit_log_format(ab,
> > +                        "op=set opid=%d contid=%llu old-contid=%llu",
> > +                        task_tgid_nr(task), contid, oldcontid);
> > +       audit_log_end(ab);
> 
> Assuming audit is enabled we always emit the record above, even if we
> were not actually able to set the Audit Container ID (ACID); this
> seems wrong to me.  I think the proper behavior would be to either add
> a "res=" field to indicate success/failure or only emit the record
> when we actually change a task's ACID.  Considering the impact that
> the ACID value will potentially have on the audit stream, it seems
> like always logging the record and including a "res=" field may be the
> safer choice.

This record should be accompanied by a syscall record (and eventually
possibly a CONTAINER_ID record of the orchestrator, if it is already in
a container).  The syscall record has a res= field that already gives
this result.

> > +       return rc;
> > +}
> > +
> >  /**
> >   * audit_log_end - end one audit record
> >   * @ab: the audit_buffer
> 
> --
> paul moore
> www.paul-moore.com
> 

- RGB

--
Richard Guy Briggs <rgb@xxxxxxxxxx>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635