On 10/24/2018 09:51 PM, Kees Cook wrote:
On Wed, Oct 24, 2018 at 8:35 AM, Michael Sammler <msammler@xxxxxxxxxxx> wrote:
Add the current value of the PKRU register to data available for
seccomp-bpf programs to work on. This allows filters based on the
currently enabled protection keys.
As mentioned in the thread, this commit log needs to be expanded greatly. :)
No problem, I will expand it with the envisioned usages of this patch.
Is there something else specifically, which should be added to the
commit log?
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Cc: Will Drewry <wad@xxxxxxxxxxxx>
Cc: linux-api@xxxxxxxxxxxxxxx
Signed-off-by: Michael Sammler <msammler@xxxxxxxxxxx>
---
arch/mips/kernel/ptrace.c | 2 ++
arch/x86/entry/common.c | 2 ++
include/uapi/linux/seccomp.h | 4 ++++
kernel/seccomp.c | 2 ++
tools/testing/selftests/seccomp/seccomp_bpf.c | 4 +++-
5 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index e5ba56c0..07d55955 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -1277,6 +1277,8 @@ asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall)
for (i = 0; i < 6; i++)
sd.args[i] = args[i];
sd.instruction_pointer = KSTK_EIP(current);
+ sd.pkru = 0;
+ sd.reserved = 0;
ret = __secure_computing(&sd);
if (ret == -1)
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 3b2490b8..2afa85d7 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -98,6 +98,8 @@ static long syscall_trace_enter(struct pt_regs *regs)
sd.arch = arch;
sd.nr = regs->orig_ax;
sd.instruction_pointer = regs->ip;
+ sd.pkru = read_pkru();
+ sd.reserved = 0;
#ifdef CONFIG_X86_64
if (arch == AUDIT_ARCH_X86_64) {
sd.args[0] = regs->di;
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 9efc0e73..e8b9ecfc 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -52,12 +52,16 @@
* @instruction_pointer: at the time of the system call.
* @args: up to 6 system call arguments always stored as 64-bit values
* regardless of the architecture.
+ * @pkru: value of the pkru register
+ * @reserved: pad the structure to a multiple of eight bytes
*/
struct seccomp_data {
int nr;
__u32 arch;
__u64 instruction_pointer;
__u64 args[6];
+ __u32 pkru;
+ __u32 reserved;
};
struct seccomp_data only needs to be 32-bit aligned. And since I think
old kernels will correctly reject filters that use the new, larger,
seccomp_data (via the sizeof() checks in seccomp_check_filter(), it
would be better to leave off the "reserved" field, since it has no
meaning right now, and it would be best to "version" the seccomp_data
entirely by size.
#endif /* _UAPI_LINUX_SECCOMP_H */
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index fd023ac2..b193c26e 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -91,6 +91,8 @@ static void populate_seccomp_data(struct seccomp_data *sd)
sd->args[4] = args[4];
sd->args[5] = args[5];
sd->instruction_pointer = KSTK_EIP(task);
+ sd->pkru = read_pkru();
In some recent MPK usage[1] it sounded like the MPK instructions were
extremely expensive to run. I'm strongly against adding any global
performance regressions to seccomp, so if this field is slow to
populate, I think we'll need some kind of flag to have a filter opt in
to using it (and have it set to some impossible value otherwise).
[1] https://github.com/AndroidHardening/hardened_malloc/commit/e985afe0e16f4e1f0d178476342de10bce5d2c0c
As I understand it, the code you mention talks about setting a
protection key being slow[1], not reading the PKRU. Reading the PKRU
should be fast as it is an register in the processor, but I have not
conducted extensive benchmarks yet.
[1] https://github.com/AndroidHardening/hardened_malloc/issues/62
An alternative idea to this proposal would be to extend BPF with a new
instruction, which reads a specified protection key. Then only BPF
programs, which use this instruction, would need to read the PKRU.
+ sd->reserved = 0;
}
/**
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index e1473234..359249cd 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -732,7 +732,9 @@ TEST(KILL_process)
TEST(arg_out_of_range)
{
struct sock_filter filter[] = {
- BPF_STMT(BPF_LD|BPF_W|BPF_ABS, syscall_arg(6)),
+ BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
+ offsetof(struct seccomp_data, reserved)
This should probably just be sizeof(struct seccomp_data).
This would work as well, but then the test would not fail, if a change
in the seccomp_data structure introduced padding.
+ + sizeof(__u32)),
BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
};
struct sock_fprog prog = {
--
2.11.0
-Kees
- Michael