Commit b041b525dab9 ("x86/split_lock: Make life miserable for split
lockers") changed the way the split lock detector works when in "warn"
mode: besides showing the warning message, it now also intentionally
slows down the offending task (by sleeping and by going through a
serialization mechanism). Based on the discussion in [0], it seems the
warning alone wasn't enough motivation for userspace developers to fix
their applications.

The original proposal in [0] was to add a new mode that would warn and
slow down the "split locking" task, keeping the old warn mode untouched.
In the end that idea was discarded, and the regular/default "warn" mode
now slows down the applications. This is quite aggressive towards
proprietary/legacy programs, which basically cannot run properly on a
kernel with this change. While it is understandable that a malicious
application could attempt a DoS by split locking, it seems unacceptable
to regress old/proprietary userspace programs through a default
configuration that previously worked. An example of such breakage was
reported in [1].

So let's add a sysctl to allow controlling the "misery mode" behavior,
as per Thomas' suggestion in [2]. This way, users running legacy and/or
proprietary software can still run it with decent performance while
still observing the warning messages in the kernel log.
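
As a usage illustration (not part of the patch): since the table is
registered under "kernel" with procname "split_lock_mitigate", the knob
should show up as /proc/sys/kernel/split_lock_mitigate. Below is a
minimal userspace sketch for reading and flipping it. The helper names
sld_mitigate_read() and sld_mitigate_write() are made up for this
example, and the path is passed in so the helpers can be tried against
a regular file.

```c
#include <assert.h>
#include <stdio.h>

/* Values accepted by the sysctl, per the documentation in this patch. */
enum { SLD_MITIGATE_OFF = 0, SLD_MITIGATE_ON = 1 };

/* Follows from registering "split_lock_mitigate" under "kernel". */
#define SLD_MITIGATE_PATH "/proc/sys/kernel/split_lock_mitigate"

/*
 * Hypothetical helper: write 0 or 1 to the file at 'path'.
 * Returns 0 on success, -1 on a bad value or I/O failure.
 */
int sld_mitigate_write(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	if (value != SLD_MITIGATE_OFF && value != SLD_MITIGATE_ON)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read the current setting from 'path'.
 * Returns 0 or 1, or -1 if the file is missing or malformed.
 */
int sld_mitigate_read(const char *path)
{
	FILE *f;
	int value;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;

	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);

	if (value != SLD_MITIGATE_OFF && value != SLD_MITIGATE_ON)
		return -1;
	return value;
}
```

Calling sld_mitigate_write(SLD_MITIGATE_PATH, SLD_MITIGATE_OFF) as root
would be the programmatic counterpart of writing 0 to the sysctl file,
i.e. keeping the warning while dropping the intentional slowdown.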

[0] https://lore.kernel.org/lkml/20220217012721.9694-1-tony.luck@xxxxxxxxx/
[1] https://github.com/doitsujin/dxvk/issues/2938
[2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/

Fixes: b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
Cc: Andre Almeida <andrealmeid@xxxxxxxxxx>
Cc: Fenghua Yu <fenghua.yu@xxxxxxxxx>
Cc: Joshua Ashton <joshua@xxxxxxxxx>
Cc: Melissa Wen <mwen@xxxxxxxxxx>
Cc: Paul Gofman <pgofman@xxxxxxxxxxxxxxx>
Cc: Pavel Machek <pavel@xxxxxxx>
Cc: Pierre-Loup Griffais <pgriffais@xxxxxxxxxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: Zebediah Figura <zfigura@xxxxxxxxxxxxxxx>
Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
---

V2:
- Switched to the sysctl approach following Thomas' suggestion (thanks!).

Andre tested the patch and will comment in this thread - it seems
everything is working as expected and we can enable/disable the sysctl,
affecting the misery mode as one expects.

I've tried to keep the semaphore's up()/down() calls in-sync/paired, hence
my approach of two delayed tasks, with and without misery.

Reviews / comments are greatly appreciated.
Thanks,


Guilherme


 Documentation/admin-guide/sysctl/kernel.rst | 18 ++++++
 arch/x86/kernel/cpu/intel.c                 | 61 +++++++++++++++++----
 2 files changed, 69 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index ee6572b1edad..508952e42914 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1298,6 +1298,24 @@ watchdog work to be queued by the watchdog timer function, otherwise the NMI
 watchdog — if enabled — can detect a hard lockup condition.
 
 
+split_lock_mitigate (x86 only)
+==============================
+
+For x86 CPUs supporting the split lock detection mechanism, this parameter
+allows users to turn off what is called "the misery mode", which
+introduces an intentional delay in userspace applications that split lock.
+The goal of the misery mode is to prevent such unaligned accesses from being
+used to DoS the system by degrading overall performance, but some of these
+split locking programs are legacy and/or proprietary software that cannot be
+fixed, so this sysctl is a way to let them run with decent performance.
+
+= ===================================================================
+0 Disables the misery mode - just logs a warning about split locks.
+1 Enables the misery mode (this is the default) - penalizes the split
+  lockers with intentional performance degradation.
+= ===================================================================
+
+
 stack_erasing
 =============
 
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 2d7ea5480ec3..2aacf9d6c723 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1034,8 +1034,32 @@ static const struct {
 
 static struct ratelimit_state bld_ratelimit;
 
+static unsigned int sysctl_sld_mitigate = 1;
 static DEFINE_SEMAPHORE(buslock_sem);
 
+#ifdef CONFIG_PROC_SYSCTL
+static struct ctl_table sld_sysctls[] = {
+	{
+		.procname	= "split_lock_mitigate",
+		.data		= &sysctl_sld_mitigate,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+	{}
+};
+
+static int __init sld_mitigate_sysctl_init(void)
+{
+	register_sysctl_init("kernel", sld_sysctls);
+	return 0;
+}
+
+late_initcall(sld_mitigate_sysctl_init);
+#endif
+
 static inline bool match_option(const char *arg, int arglen, const char *opt)
 {
 	int len = strlen(opt), ratelimit;
@@ -1146,11 +1170,18 @@ static void split_lock_init(void)
 		split_lock_verify_msr(sld_state != sld_off);
 }
 
-static void __split_lock_reenable(struct work_struct *work)
+static void __split_lock_reenable_sem(struct work_struct *work)
 {
 	sld_update_msr(true);
 	up(&buslock_sem);
 }
+static DECLARE_DELAYED_WORK(split_lock_reenable_sem, __split_lock_reenable_sem);
+
+static void __split_lock_reenable(struct work_struct *work)
+{
+	sld_update_msr(true);
+}
+static DECLARE_DELAYED_WORK(split_lock_reenable, __split_lock_reenable);
 
 /*
  * If a CPU goes offline with pending delayed work to re-enable split lock
@@ -1169,10 +1200,9 @@ static int splitlock_cpu_offline(unsigned int cpu)
 	return 0;
 }
 
-static DECLARE_DELAYED_WORK(split_lock_reenable, __split_lock_reenable);
-
 static void split_lock_warn(unsigned long ip)
 {
+	struct delayed_work *wk;
 	int cpu;
 
 	if (!current->reported_split_lock)
@@ -1180,14 +1210,25 @@ static void split_lock_warn(unsigned long ip)
 				    current->comm, current->pid, ip);
 	current->reported_split_lock = 1;
 
-	/* misery factor #1, sleep 10ms before trying to execute split lock */
-	if (msleep_interruptible(10) > 0)
-		return;
-	/* Misery factor #2, only allow one buslocked disabled core at a time */
-	if (down_interruptible(&buslock_sem) == -EINTR)
-		return;
+	if (sysctl_sld_mitigate) {
+		/*
+		 * misery factor #1:
+		 * sleep 10ms before trying to execute split lock.
+		 */
+		if (msleep_interruptible(10) > 0)
+			return;
+		/*
+		 * Misery factor #2:
+		 * only allow one buslocked disabled core at a time.
+		 */
+		wk = &split_lock_reenable_sem;
+		if (down_interruptible(&buslock_sem) == -EINTR)
+			return;
+	} else
+		wk = &split_lock_reenable;
+
 	cpu = get_cpu();
-	schedule_delayed_work_on(cpu, &split_lock_reenable, 2);
+	schedule_delayed_work_on(cpu, wk, 2);
 
 	/* Disable split lock detection on this CPU to make progress */
 	sld_update_msr(false);
-- 
2.38.0
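
For reviewers, here is a small userspace model of why two delayed work
items are used. It is only a sketch under simplifying assumptions:
struct sld_model and the sld_model_*/reenable* names are hypothetical,
the semaphore is modeled as a plain counter, and the 10ms sleep and
interruptible waits are left out. The point it tries to show is that
the re-enable work is picked at the same time down() is (or is not)
taken, so up() stays paired even if the sysctl is flipped while the
work is still pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel state (model only, not kernel code). */
struct sld_model {
	int sem_count;		/* models buslock_sem, initially 1 */
	bool detect_enabled;	/* models split lock detection on this CPU */
	bool mitigate;		/* models sysctl_sld_mitigate */
};

typedef void (*sld_work_fn)(struct sld_model *);

/* Models __split_lock_reenable_sem(): re-enable detection and up(). */
static void reenable_sem(struct sld_model *m)
{
	m->detect_enabled = true;
	m->sem_count++;
}

/* Models __split_lock_reenable(): re-enable detection, no up(). */
static void reenable(struct sld_model *m)
{
	m->detect_enabled = true;
}

/*
 * Models the tail of split_lock_warn(). Returns the work that would be
 * scheduled, or NULL where the real code would block in
 * down_interruptible() (the model does not sleep).
 */
sld_work_fn sld_model_warn(struct sld_model *m)
{
	sld_work_fn wk;

	assert(m != NULL);
	if (m->mitigate) {
		if (m->sem_count == 0)
			return NULL;
		m->sem_count--;		/* down() */
		wk = reenable_sem;	/* this work will do the up() */
	} else {
		wk = reenable;		/* no down(), so no up() either */
	}

	m->detect_enabled = false;	/* let the task make progress */
	return wk;
}
```

If the decision were instead made inside a single work function by
re-reading the sysctl, a toggle between scheduling and execution could
leave the semaphore unbalanced; choosing the work item up front avoids
that in this model.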