[net-next RFC v2 0/9] Add Checmate: BPF-driven minor LSM

I've begun building out the skeleton of a Linux Security Module, and I'd like to
get feedback on it. Only a few hooks are populated so far, so I'm mostly looking
for input on the general proposal, interest, and design. It's a minor LSM. My
particular use case is one in which containers are dynamically deployed to
machines by internal developers in a different group. The point of Checmate is
to act as an extensible bed for _safe_, complex security policies. It enables
dynamic security policies that can be defined in C and changed as necessary,
without ever having to patch or rebuild the kernel.

This is the second reroll of this patchset, and it's quite different from the
first approach. Instead of being totally independent of the cgroups code, it is
now a cgroups controller. It relies on the LSM API to hook into points in the
kernel, and on the cgroups APIs to determine which policy to enforce.

Right now, it's meant to be applied to containers. It is expected that it'd be 
configured by some kind of central management system. It's also expected that 
the central management system would have a set of policies that ship as binary 
images, and are controlled by BPF maps. Using this, one can have fairly complex 
filters, without requiring an entire toolchain. Although the patchset currently
locks BPF programs to only working against the kernel they were compiled with,
nothing prevents us from changing this in the future.

To start, it only hooks into a subset of the LSM network API. The primary
reasons for this are simplicity and a wish to start the comment process early
rather than build out the full infrastructure first. Also, there have been a
number of patches (LandLock, Network cgroups controller, Daniel Mack's BPF
filters on cgroups) that are similar, and this set of hooks solves many of the
same problems.


Although much of this sounds like seccomp at first, it's quite different. You
have access to kernel pointers, which lets you dereference and read data like
sockaddrs safely. Since the data has already been copied into kernelspace, you
don't have to worry about TOCTOU attacks.

The user-facing bits of the API are detailed in "Add LSM / BPF Checmate docs",
but a short summary is that Checmate is a cgroups controller. You can enable
it, and then write your BPF FDs to special control files. Once you do this,
the programs are enforced on all processes in that cgroup, and below it.
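
As a rough illustration of the "write your BPF FDs to special control files"
step, here is a minimal userspace sketch. The helper name checmate_attach, the
control file path passed to it, and the choice to write the FD as decimal text
are all assumptions made for illustration; the docs patch is the authority on
the real file names and format.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a BPF program FD, formatted as decimal
 * text (an assumption), to a cgroup control file.
 * Returns 0 on success, -1 on failure.
 */
static int checmate_attach(const char *ctl_path, int prog_fd)
{
	char buf[16];
	ssize_t written;
	int len, fd;

	len = snprintf(buf, sizeof(buf), "%d\n", prog_fd);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	fd = open(ctl_path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, buf, (size_t)len);
	close(fd);

	return written == (ssize_t)len ? 0 : -1;
}
```

A management daemon would call this once per hook it wants to populate, after
loading the program and obtaining its FD from the bpf() syscall.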

To answer the question of why not use iptables: oftentimes, the overhead of
using a second network namespace is unacceptable. That's not because network
namespaces are inherently expensive, but because many of us leverage
infrastructure that cannot handle multiple IPs, and therefore we have to do
"weird" tricks to get multiple network namespaces to work (NAT, mirroring,
etc.).

Open Questions:

1) Performance: 

Right now, the patches aren't really performance optimized. For the task hooks,
it's cheap enough because it's one dereference from task->cgroup, and then a
matter of walking up the hierarchy. On the other hand, for socks it can be
considerably more expensive.
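
To make the task-hook cost concrete, here is a small userspace model of "one
dereference, then walk up the hierarchy". The types model_cgroup and
model_task, the run_hook function, and the rule that the first nonzero return
denies the operation are all assumptions for illustration, not the kernel's
structures or the patchset's exact semantics.

```c
#include <stddef.h>

/* Hypothetical userspace model; these are not kernel types. */
typedef int (*policy_fn)(const void *ctx);

struct model_cgroup {
	struct model_cgroup *parent;	/* NULL at the root */
	policy_fn prog;			/* NULL if nothing is attached */
};

struct model_task {
	struct model_cgroup *cgroup;
};

/*
 * Run every attached program from the task's cgroup up to the root.
 * Assumption: the first nonzero return denies the operation.
 */
static int run_hook(const struct model_task *task, const void *ctx)
{
	const struct model_cgroup *cg;
	int ret;

	for (cg = task->cgroup; cg; cg = cg->parent) {
		if (!cg->prog)
			continue;
		ret = cg->prog(ctx);
		if (ret)
			return ret;
	}
	return 0;
}

/* Two trivial policies used to exercise the walk. */
static int allow_all(const void *ctx) { (void)ctx; return 0; }
static int deny_all(const void *ctx) { (void)ctx; return -1; }
```

The cost is linear in the depth of the cgroup, which is why it stays cheap for
tasks but matters more on hot per-packet paths.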

I am thinking that maybe it makes sense to add the security hook dynamically the
first time that someone writes a BPF program to that controller. This way, you
can have filters on syscalls that happen rarely, like bind, but you avoid
paying the cost on expensive hooks like rcv_skb.
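
A sketch of that idea as a userspace model: the hook is only marked live on
the first attach, so hooks nobody uses cost nothing. The enum values, the
hook_state struct, and attach_prog are hypothetical names; a real
implementation would install an LSM callback where this model flips a flag.

```c
#include <stdbool.h>

/* Hypothetical hook identifiers for this model only. */
enum model_hook { MODEL_HOOK_BIND, MODEL_HOOK_RCV_SKB, MODEL_HOOK_MAX };

struct hook_state {
	bool registered;	/* has the callback been installed yet? */
	unsigned int nprogs;	/* programs attached to this hook */
};

static struct hook_state hooks[MODEL_HOOK_MAX];

/* Attach a program; install the hook lazily on the first attach. */
static int attach_prog(enum model_hook h)
{
	if ((unsigned int)h >= MODEL_HOOK_MAX)
		return -1;

	if (!hooks[h].registered)
		hooks[h].registered = true;	/* stand-in for registering the hook */

	hooks[h].nprogs++;
	return 0;
}

/* A hook that was never attached to stays off the fast path. */
static bool hook_is_live(enum model_hook h)
{
	return (unsigned int)h < MODEL_HOOK_MAX && hooks[h].registered;
}
```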

It would be really nice if sock_cgroup_data included pointers to the CSSs that 
were effective for a given sock.

Also, a minor point. The way that the Checmate structs are packed, we lose 4
bytes for every hook because of alignment. If we moved the counts into the
top-level data structure, we could work around this. I'd prefer not to do that.
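
The alignment point can be seen with a small stand-alone comparison. The
struct and field names below are hypothetical and do not mirror the actual
layout in checmate.h; the sketch only assumes that each hook carries a pointer
plus an int count.

```c
#include <stddef.h>

#define MODEL_NHOOKS 4	/* arbitrary hook count for the comparison */

/* Per-hook layout: pointer followed by an int count. */
struct hook_packed {
	void *progs;
	int count;	/* trailing padding follows when pointers are 8 bytes */
};

struct table_packed {
	struct hook_packed hooks[MODEL_NHOOKS];
};

/* Alternative: counts hoisted into the top-level structure. */
struct table_split {
	void *progs[MODEL_NHOOKS];
	int counts[MODEL_NHOOKS];
};

/* Bytes of padding carried by each per-hook struct on this target. */
static size_t padding_per_hook(void)
{
	return sizeof(struct hook_packed) - (sizeof(void *) + sizeof(int));
}
```

On a typical LP64 target padding_per_hook() works out to 4, matching the loss
described above; the split layout avoids it at the cost of keeping a hook's
pointer and count apart.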

2) API

The API right now tightly ties programs to the kernel version. I don't see a
good way around this unless we decide that a subset of the LSM hooks API is
immutable. That's a question for the LSM maintainers.

Thanks to Alexei, Daniel B, Daniel Mack, and Tejun for input. I would love
to know what y'all think.


Sargun Dhillon (9):
  net: Make cgroup sk data present when calling security_sk_(alloc/free)
  cgroups: move helper cgroup_parent to cgroup.h
  bpf: move tracing helpers (probe_read, get_current_task) to shared
    helpers
  bpf, security: Add Checmate security LSM and BPF program type
  bpf: Add bpf_probe_write_checmate helper
  bpf: Share current_task_under_cgroup helper and expose to Checmate
    programs
  samples/bpf: Split out helper code from
    test_current_task_under_cgroup_user
  samples/bpf: Add limit_connections, remap_bind checmate examples /
    tests
  doc: Add LSM / BPF Checmate docs

 Documentation/security/Checmate.txt               |  54 ++
 include/linux/bpf.h                               |   3 +
 include/linux/cgroup.h                            |  16 +
 include/linux/cgroup_subsys.h                     |   4 +
 include/linux/checmate.h                          | 108 ++++
 include/uapi/linux/bpf.h                          |  12 +
 kernel/bpf/helpers.c                              |  63 +++
 kernel/bpf/syscall.c                              |   2 +-
 kernel/cgroup.c                                   |   9 -
 kernel/trace/bpf_trace.c                          |  61 ---
 net/core/sock.c                                   |   5 +-
 samples/bpf/Makefile                              |  12 +-
 samples/bpf/bpf_helpers.h                         |   2 +
 samples/bpf/bpf_load.c                            |  11 +-
 samples/bpf/cgroup_helpers.c                      | 103 ++++
 samples/bpf/cgroup_helpers.h                      |  15 +
 samples/bpf/checmate_limit_connections_kern.c     | 146 ++++++
 samples/bpf/checmate_limit_connections_user.c     | 113 ++++
 samples/bpf/checmate_remap_bind_kern.c            |  28 +
 samples/bpf/checmate_remap_bind_user.c            |  82 +++
 samples/bpf/test_current_task_under_cgroup_user.c |  72 +--
 security/Kconfig                                  |   1 +
 security/Makefile                                 |   2 +
 security/checmate/Kconfig                         |  11 +
 security/checmate/Makefile                        |   3 +
 security/checmate/checmate_bpf.c                  | 125 +++++
 security/checmate/checmate_lsm.c                  | 610 ++++++++++++++++++++++
 27 files changed, 1534 insertions(+), 139 deletions(-)
 create mode 100644 Documentation/security/Checmate.txt
 create mode 100644 include/linux/checmate.h
 create mode 100644 samples/bpf/cgroup_helpers.c
 create mode 100644 samples/bpf/cgroup_helpers.h
 create mode 100644 samples/bpf/checmate_limit_connections_kern.c
 create mode 100644 samples/bpf/checmate_limit_connections_user.c
 create mode 100644 samples/bpf/checmate_remap_bind_kern.c
 create mode 100644 samples/bpf/checmate_remap_bind_user.c
 create mode 100644 security/checmate/Kconfig
 create mode 100644 security/checmate/Makefile
 create mode 100644 security/checmate/checmate_bpf.c
 create mode 100644 security/checmate/checmate_lsm.c

-- 
2.7.4
