[PATCH V0 3/3] x86, bm: Add documentation on Intel Branch Monitoring

This patch adds Documentation/x86/intel_bm.txt, which documents the
Intel Branch Monitoring feature and its perf-based driver.

Signed-off-by: Megha Dey <megha.dey@xxxxxxxxxxxxxxx>
Signed-off-by: Yu-Cheng Yu <yu-cheng.yu@xxxxxxxxx>
---
 Documentation/x86/intel_bm.txt | 216 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 216 insertions(+)
 create mode 100644 Documentation/x86/intel_bm.txt

diff --git a/Documentation/x86/intel_bm.txt b/Documentation/x86/intel_bm.txt
new file mode 100644
index 0000000..25b7177
--- /dev/null
+++ b/Documentation/x86/intel_bm.txt
@@ -0,0 +1,216 @@
+Intel(R) Branch Monitoring
+
+Copyright (C) 2017 Intel Corporation
+
+Megha Dey <megha.dey@xxxxxxxxx>
+Yu-Cheng Yu <yu-cheng.yu@xxxxxxxxx>
+
+I. Overview
+===========
+
+The Cannonlake family of Intel processors supports the branch monitoring
+feature. This feature uses heuristics to detect the occurrence of an ROP
+(Return Oriented Programming) or ROP-like (JOP: Jump Oriented Programming)
+attack. These heuristics are based on certain performance monitoring
+statistics, measured dynamically over a short, configurable window period.
+ROP is a malware technique in which the attacker compromises a return
+pointer held on the stack to redirect execution to a different desired
+instruction.
+
+Support for branch monitoring has been added via the Linux kernel perf event
+infrastructure. This feature is enabled by CONFIG_PERF_EVENTS_INTEL_BM.
+
+Once the kernel is compiled with CONFIG_PERF_EVENTS_INTEL_BM=y on a
+Cannonlake system, the following perf events are added which can be viewed
+with perf list:
+  intel_bm/branch-misp/                              [Kernel PMU event]
+  intel_bm/call-ret/                                 [Kernel PMU event]
+  intel_bm/far-branch/                               [Kernel PMU event]
+  intel_bm/indirect-branch-misp/                     [Kernel PMU event]
+  intel_bm/ret-misp/                                 [Kernel PMU event]
+  intel_bm/rets/                                     [Kernel PMU event]
+
+II. Hardware details
+====================
+
+The MSRs associated with branch monitoring are as follows:
+
+1. BR_DETECT_CTRL : Branch Monitoring Global control
+   Used for enabling and configuring global capability
+
+2. BR_DETECT_STATUS : Branch Monitoring Global Status
+   Used by SW handler for determining detect status
+
+3. BR_DETECT_COUNTER_CONFIG_i : Branch Monitoring Counter Configuration
+   Per-cpu branch monitoring counter Configuration
+
+There are two 8-bit counters, each of which can select one of the
+following 6 events:
+
+1. RET instructions: Counts the number of near return instructions retired
+
+2. CALL-RET instructions: Counts the difference between the number of near
+   return and call instructions retired
+
+3. RET mispredicts: Mispredicted return instructions retired
+
+4. Branch (all) mispredicts: Counts the number of mispredicted branches
+
+5. Indirect branch mispredicts: Counts the number of mispredicted indirect
+   near branch instructions. Includes indirect near jump/call instructions
+
+6. Far branch instructions: Counts the number of far branches retired
+
+Branch Monitoring hardware utilizes various existing performance related
+counter events. Of the 6 events above, only call-ret is newly implemented.
+
+The events are evaluated over a specified 10-bit instruction window size
+(0 to 1023). For each counter, a threshold value (0 to 127) can be
+configured to set a point at which an interrupt is generated and a
+detection event action is taken (determined by user-space). This can take
+the form of signaling an interrupt and/or freezing the state of the last
+branch record information.
+
+The event counters are reset after every 'window size' instructions by the
+hardware.
+
+The feature is for user mode (privilege level > 0) operation only, which is
+the known malware security threat target environment. While in supervisor
+mode, this heuristic detection counter activity is suspended. This behavior
+(user mode) is independent of root vs. non-root with respect to
+virtualization technology execution.
+
+III. Software Implementation
+============================
+
+A perf-based kernel driver has been used to monitor the occurrence of
+one of the 6 branch monitoring events.
+
+When a branch monitoring interrupt is generated, the interrupt bit is set.
+The interrupt handler clears this bit, and the event counters are reset.
+
+The entire system can monitor a maximum of 2 events at any given time.
+These events can belong to the same or different tasks.
+
+Every time a task is scheduled out, we save the current window and count
+associated with the event being monitored. When the task is next scheduled
+in, we resume counting from the saved count for this event. Thus, a
+full context switch in this case is not necessary.
+
+The Branch Monitoring exception can be configured as a regular interrupt or
+an NMI. We chain an NMI handler after the PMU handler because:
+1. It will not interfere with PMU events
+2. We only monitor for user-mode events, and this will not delay branch
+   monitoring events for user-mode
+
+We monitor only per-task events. It does not make sense to monitor all tasks
+for an attack. This could generate a lot of false positives.
+
+IV. User-configurable inputs
+============================
+
+Several sysfs entries are provided in /sys/devices/intel_bm/ to configure
+controls for the supported hardware heuristics.
+
+1. LBR freeze: /sys/devices/intel_bm/lbr_freeze
+   Possible values are 0 or 1. By default this is disabled (0). When enabled,
+   an LBR freeze is observed on threshold trip.
+
+2. Guest Disable: /sys/devices/intel_bm/guest_disable
+   Possible values are 0 or 1. By default it is 0. When set to '1', the branch
+   monitoring feature is disabled in VMX non-root operation.
+
+3. Window size: /sys/devices/intel_bm/window_size
+   By default, window size is 1023. It can take values from 0 to 1023. This
+   represents the number of instructions to be executed before the event
+   counters are reset.
+
+4. Window count select: /sys/devices/intel_bm/window_cnt_sel
+   Possible values are:
+   00 = instructions retired
+   01 = branches retired
+   10 = return instructions retired
+   11 = indirect branch instructions retired
+   By default, it has a value of 0.
+
+5. Count and mode: /sys/devices/intel_bm/cnt_and_mode
+   Possible values are 0 or 1. By default it is 0. When set to '1', the
+   overall event triggering condition is true only if both enabled
+   counters' threshold conditions are true. When '0', the threshold
+   tripping condition is true if either enabled counter's threshold is
+   true. If a counter is not enabled, then it does not factor into the
+   AND'ing logic.
+
+6. Threshold: /sys/devices/intel_bm/threshold
+   An unsigned value of 0 to 127 is supported. A counter threshold of 0
+   results in a branch monitoring event being signaled after every
+   instruction. By default, it has a value of 127.
+
+7. Mispredict counting behaviour: /sys/devices/intel_bm/mispred_evt_cnt
+   Possible values are:
+   0 = mispredict events are counted in a window
+   1 = mispredict events are counted based on consecutive occurrences
+   By default, it has a value of 0.
+
+Threshold and mispredict event counting behaviour are per-counter
+configurations, whereas the rest are global.
+
+V. Example usage
+================
+
+1. To monitor a user space application for branch monitoring events, the perf
+command line can be used as follows:
+
+perf stat -e intel_bm/rets/ ./test
+
+ Performance counter stats for './test':
+
+                 1      intel_bm/rets/
+
+       0.104705937 seconds time elapsed
+
+where test.c is:
+
+void func(void)
+{
+        return;
+}
+
+int main(void)
+{
+        int i;
+
+        for (i = 0; i < 128; i++) {
+                func();
+        }
+
+        return 0;
+}
+
+and threshold = 100 (echo 100 > /sys/devices/intel_bm/threshold)
+
+perf reports the number of branch monitoring interrupts that occurred while
+the user-space application was running.
+
+2. To monitor 2 events for a task:
+
+perf stat -e intel_bm/far-branch/,intel_bm/rets/ ./rets-128.bin
+
+ Performance counter stats for './rets-128.bin':
+
+                 0      intel_bm/far-branch/
+                 1      intel_bm/rets/
+
+       0.104057608 seconds time elapsed
+
+For the above example, the threshold and window size are shared.
+
+3. To monitor 2 events with different thresholds (same or different task):
+
+On terminal 1:
+echo <threshold1> > /sys/devices/intel_bm/threshold
+perf stat -e intel_bm/rets/ ./test.bin
+
+On terminal 2:
+echo <threshold2> > /sys/devices/intel_bm/threshold
+perf stat -e intel_bm/call-ret/ ./test.bin
-- 
1.9.1
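
For readers of section IV, the cnt_and_mode description can be read as the
small C model below. It is only a sketch of the documented semantics, not the
driver or hardware logic. The names bm_counter, bm_counter_tripped and
bm_event_triggered are hypothetical, and the "count >= threshold" comparison
is an assumption inferred from the statement that a threshold of 0 signals an
event after every instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BM_NUM_COUNTERS  2    /* two 8-bit counters, per section II */
#define BM_THRESHOLD_MAX 127  /* documented threshold range is 0..127 */

/* Hypothetical software model of one counter; not a register layout. */
struct bm_counter {
	bool    enabled;
	uint8_t count;      /* events seen in the current window */
	uint8_t threshold;  /* 0..BM_THRESHOLD_MAX */
};

/* Assumption: an enabled counter trips once count >= threshold. */
bool bm_counter_tripped(const struct bm_counter *c)
{
	assert(c->threshold <= BM_THRESHOLD_MAX);
	return c->enabled && c->count >= c->threshold;
}

/*
 * cnt_and_mode = 1: every enabled counter must have tripped.
 * cnt_and_mode = 0: any one enabled counter tripping is enough.
 * Disabled counters are ignored in both modes.
 */
bool bm_event_triggered(const struct bm_counter ctr[BM_NUM_COUNTERS],
			bool cnt_and_mode)
{
	bool any_enabled = false;
	bool any_tripped = false;
	bool all_tripped = true;
	int i;

	for (i = 0; i < BM_NUM_COUNTERS; i++) {
		if (!ctr[i].enabled)
			continue;
		any_enabled = true;
		if (bm_counter_tripped(&ctr[i]))
			any_tripped = true;
		else
			all_tripped = false;
	}

	if (!any_enabled)
		return false;

	return cnt_and_mode ? all_tripped : any_tripped;
}
```

With one enabled counter at 100 against a threshold of 100 and a second
enabled counter below its threshold, this model triggers with cnt_and_mode = 0
but not with cnt_and_mode = 1.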
