Hardware-assisted tracing families such as ARM CoreSight and Intel PT
provide rich tracing capabilities, including instruction-level tracing
and accurate timestamps. These are very useful for profiling but also
pose a significant security risk. One example of such a risk is when
kernel mode tracing is not excluded and hardware-assisted tracing can
be used to analyze cryptographic code execution. In this case, even the
root user must not be able to infer anything.

To explain it more clearly in the words of a security team member
(credits: Mattias Nissler),

"Consider a system where disk contents are encrypted and the encryption
key is set up by the user when mounting the file system. From that point
on the encryption key resides in the kernel. It seems reasonable to
expect that the disk encryption key be protected from exfiltration even
if the system later suffers a root compromise (or even against insiders
that have root access), at least as long as the attacker doesn't
manage to compromise the kernel."

Here the idea is to protect such important information from all users,
including root, since root privileges do not have to mean full control
over the kernel [1] and a root compromise does not have to be the end
of the world.

But "Peter said even the regular counters can be used for full branch
trace, the information isn't as accurate as PT and friends and not
easier but is good enough to infer plenty". This means that a global
tunable config covering all kernel mode PMU tracing is more appropriate
than one targeting only hardware-assisted instruction tracing.

Currently we can exclude kernel mode tracing via the perf_event_paranoid
sysctl, but it has the following limitations:

 * There is no option to restrict kernel mode instruction tracing by
   the root user.
 * It is not possible to restrict kernel mode instruction tracing when
   hardware-assisted tracing IPs like ARM CoreSight ETMs expose an
   additional sysfs interface for tracing besides the perf interface.

So introduce a new config, CONFIG_EXCLUDE_KERNEL_PMU_TRACE, to exclude
kernel mode PMU tracing. It is generic, applies to all hardware tracing
families, and can also be used by other interfaces such as sysfs in the
case of ETMs.

[1] https://lwn.net/Articles/796866/

Suggested-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
Suggested-by: Al Grant <al.grant@xxxxxxx>
Tested-by: Denis Nikitin <denik@xxxxxxxxxxxx>
Link: https://lore.kernel.org/lkml/20201015124522.1876-1-saiprakash.ranjan@xxxxxxxxxxxxxx/
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@xxxxxxxxxxxxxx>
---
 init/Kconfig         | 11 +++++++++++
 kernel/events/core.c |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 22946fe5ded9..34d9b7587d2e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1848,6 +1848,17 @@ config DEBUG_PERF_USE_VMALLOC
 
 endmenu
 
+config EXCLUDE_KERNEL_PMU_TRACE
+	bool "Exclude Kernel mode PMU tracing"
+	depends on PERF_EVENTS
+	help
+	  Exclude Kernel mode PMU tracing for all users.
+
+	  This option allows to disable kernel mode tracing for all
+	  users(including root) which is especially useful in production
+	  systems where only userspace tracing might be preferred for
+	  security reasons.
+
 config VM_EVENT_COUNTERS
 	default y
 	bool "Enable VM event counters for /proc/vmstat" if EXPERT
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0aeca5f3c0ac..241cc9640483 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11770,6 +11770,9 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (err)
 		return err;
 
+	if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_PMU_TRACE) && !attr.exclude_kernel)
+		return -EACCES;
+
 	if (!attr.exclude_kernel) {
 		err = perf_allow_kernel(&attr);
 		if (err)
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
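
For illustration only, the ordering of the two checks in the
perf_event_open() hunk can be modelled in plain userspace C. This is a
rough sketch, not kernel code: struct open_request and the model_*
functions are hypothetical names invented here, the sysctl gate is
reduced to what perf_allow_kernel() is understood to test (paranoid
level above 1 and no CAP_PERFMON/CAP_SYS_ADMIN), and the LSM hook it
also calls is left out. The point it shows is that a privileged caller
passes the sysctl gate at any paranoid level, while the build-time gate
rejects a kernel mode request from everyone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of a request that matter here. */
struct open_request {
	bool exclude_kernel;	/* mirrors perf_event_attr.exclude_kernel */
	bool perfmon_capable;	/* assumed true for root (CAP_PERFMON/CAP_SYS_ADMIN) */
};

/*
 * Simplified model of the existing sysctl gate: it only restricts
 * callers that lack the capability, so root is never refused.
 */
static int model_paranoid_check(const struct open_request *req, int paranoid)
{
	if (paranoid > 1 && !req->perfmon_capable)
		return -EACCES;
	return 0;
}

/*
 * Model of the check order after the patch: the build-time gate runs
 * first and ignores privileges, then the sysctl gate applies as before.
 * config_exclude_kernel stands in for
 * IS_ENABLED(CONFIG_EXCLUDE_KERNEL_PMU_TRACE).
 */
static int model_perf_event_open(const struct open_request *req,
				 bool config_exclude_kernel, int paranoid)
{
	if (config_exclude_kernel && !req->exclude_kernel)
		return -EACCES;

	if (!req->exclude_kernel)
		return model_paranoid_check(req, paranoid);

	return 0;
}

/* Small self-check; call it from a main() of your own. */
static void model_self_check(void)
{
	struct open_request root_kernel = {
		.exclude_kernel = false,
		.perfmon_capable = true,
	};

	/* Root passes the sysctl gate even at paranoid level 2 ... */
	assert(model_perf_event_open(&root_kernel, false, 2) == 0);
	/* ... but is refused once the config option is modelled as set. */
	assert(model_perf_event_open(&root_kernel, true, 2) == -EACCES);
}
```

In this model a request with exclude_kernel set is unaffected by the
new option, which matches the intent that userspace-only tracing keeps
working on production systems.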