Re: [PATCH v3 1/1] ACPICA: utdebug: use spinlocks to fix the data-races reported by the KCSAN

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 9/20/23 19:03, Rafael J. Wysocki wrote:
On Wed, Sep 20, 2023 at 6:30 PM Mirsad Goran Todorovac
<mirsad.todorovac@xxxxxxxxxxxx> wrote:

KCSAN reported hundreds of instances of data-races in ACPICA like this one:

[    6.994149] ==================================================================
[    6.994443] BUG: KCSAN: data-race in acpi_ut_status_exit / acpi_ut_trace

[    6.994795] write to 0xffffffffbae5a884 of 4 bytes by task 0 on cpu 2:
[    6.994944] acpi_ut_status_exit (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/utdebug.c:467)
[    6.994957] acpi_hw_register_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:563)
[    6.994968] acpi_read_bit_register (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwxface.c:171)
[    6.994980] acpi_idle_bm_check (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:511)
[    6.994990] acpi_idle_enter_bm (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:644 (discriminator 1))
[    6.995000] acpi_idle_enter (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:695)
[    6.995010] cpuidle_enter_state (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:267)
[    6.995019] cpuidle_enter (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:390)
[    6.995027] call_cpuidle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:135)
[    6.995038] do_idle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:219 /home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:282)
[    6.995046] cpu_startup_entry (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:378 (discriminator 1))
[    6.995055] start_secondary (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:210 /home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:294)
[    6.995066] secondary_startup_64_no_verify (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/head_64.S:433)

[    6.995121] read to 0xffffffffbae5a884 of 4 bytes by task 0 on cpu 9:
[    6.995267] acpi_ut_trace (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/utdebug.c:263)
[    6.995279] acpi_hw_validate_io_request (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwvalid.c:101)
[    6.995291] acpi_hw_read_port (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwvalid.c:202)
[    6.995303] acpi_hw_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:251)
[    6.995313] acpi_hw_register_read (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:725 /home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwregs.c:499)
[    6.995325] acpi_read_bit_register (/home/marvin/linux/kernel/torvalds2/drivers/acpi/acpica/hwxface.c:171)
[    6.995336] acpi_idle_bm_check (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:511)
[    6.995346] acpi_idle_enter_bm (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:644 (discriminator 1))
[    6.995356] acpi_idle_enter (/home/marvin/linux/kernel/torvalds2/drivers/acpi/processor_idle.c:695)
[    6.995366] cpuidle_enter_state (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:267)
[    6.995375] cpuidle_enter (/home/marvin/linux/kernel/torvalds2/drivers/cpuidle/cpuidle.c:390)
[    6.995383] call_cpuidle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:135)
[    6.995394] do_idle (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:219 /home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:282)
[    6.995402] cpu_startup_entry (/home/marvin/linux/kernel/torvalds2/kernel/sched/idle.c:378 (discriminator 1))
[    6.995411] start_secondary (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:210 /home/marvin/linux/kernel/torvalds2/arch/x86/kernel/smpboot.c:294)
[    6.995422] secondary_startup_64_no_verify (/home/marvin/linux/kernel/torvalds2/arch/x86/kernel/head_64.S:433)

[    6.995476] value changed: 0x00000004 -> 0x00000002

[    6.995629] Reported by Kernel Concurrency Sanitizer on:
[    6.995748] CPU: 9 PID: 0 Comm: swapper/9 Not tainted 6.6.0-rc2-kcsan-00003-g16819584c239-dirty #21
[    6.995758] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
[    6.995765] ==================================================================

Please find the complete list at: https://domac.alu.unizg.hr/~mtodorov/linux/patches/acpica_utdebug/acpi_ut_status_exit.log.xz

A number of unprotected increments:

         acpi_gbl_nesting_level++;

and conditional statements:

         if (acpi_gbl_nesting_level) {
                 acpi_gbl_nesting_level--;
         }

no longer work in SMP environment.

Proper locking like

         spin_lock(&acpi_utdebug_lock);
         acpi_gbl_nesting_level++;
         spin_unlock(&acpi_utdebug_lock);

and

         spin_lock(&acpi_utdebug_lock);
         if (acpi_gbl_nesting_level) {
                 acpi_gbl_nesting_level--;
         }
         spin_unlock(&acpi_utdebug_lock);

makes these data-races go away.

Additionally, READ_ONCE() or WRITE_ONCE() is required with the global variable
acpi_gbl_nesting_level to prevent unwanted read or write reordering or other funny
stuff the optmisers do.

The patch eliminates KCSAN BUG warnings.

Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@xxxxxxxxxxxx>
Fixes: 6be2d72b18649 ("ACPICA: Update for a few debug output statements")
Fixes: bf9b448ef8430 ("ACPICA: Debug output: Do not emit function nesting level for kernel build.")
Fixes: 6e875fa0480c1 ("ACPICA: Debugger: fix slight indentation issue")
Fixes: ^1da177e4c3f4 ("Initial git repository build.")
Cc: Jung-uk Kim <jkim@xxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Erik Kaneda <erik.kaneda@xxxxxxxxx>
Cc: Bob Moore <robert.moore@xxxxxxxxx>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx>
Cc: Len Brown <lenb@xxxxxxxxxx>
Cc: linux-acpi@xxxxxxxxxxxxxxx
Cc: acpica-devel@xxxxxxxxxxxxxxxxxxxxxxxxx
Reported-by: kernel test robot <lkp@xxxxxxxxx>
Closes: https://lore.kernel.org/oe-kbuild-all/202309201331.S2c1JL2h-lkp@xxxxxxxxx/
Closes: https://lore.kernel.org/oe-kbuild-all/202309201308.5ZBJFbjh-lkp@xxxxxxxxx/
Link: https://github.com/acpica/acpica/pull/893
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@xxxxxxxxxxxx>
---
v2 -> v3:
  Made the pull request at the ACPICA github.
  Fixed DEFINE_SPINLOCK() undefined warning by putting it inside #ifdef ACPI_DEBUG_OUTPUT
  ... #endif
  Capitalised ACPICA according to the convention.

v1 -> v2:
  Moved the declaration of 'u32 nesting_level' inside #ifdef ACPI_APPLICATION ... #endif
  according to the unused variable warning of the kernel test robot.

v1:
  Preliminary RFC version of the patch.

  drivers/acpi/acpica/utdebug.c | 49 +++++++++++++++++++++++++----------
  1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/drivers/acpi/acpica/utdebug.c b/drivers/acpi/acpica/utdebug.c
index c5f6c85a3a09..594eee294f0f 100644
--- a/drivers/acpi/acpica/utdebug.c
+++ b/drivers/acpi/acpica/utdebug.c
@@ -15,8 +15,9 @@

  #define _COMPONENT          ACPI_UTILITIES
  ACPI_MODULE_NAME("utdebug")
-
  #ifdef ACPI_DEBUG_OUTPUT
+static DEFINE_SPINLOCK(acpi_utdebug_lock);

No, you can't do that in the ACPICA code.  Likewise for other
Linux-specific pieces of code in the patch below.

I'd very much prefer you to send a proper problem report instead of
sending patches like this.

I was so enthusiastic about the patch that cleared all the KCSAN data-race bug warnings,
that I must have forgot about sending one.

Patching of hundreds of annoying KCSAN acpi warnings seemed so good that I forgot about the
procedure, sorry.

I am aware that the maintainers have to carefully evaluate any contribution in order for the
kernel not becoming bloated and unstable ...

As it stands, I'm not even sure what problem exactly this is
attempting to address.

I hope the attached excerpt from the dmesg log will help.

Thanks!

Anytime.

Best regards,
Mirsad Todorovac

Attachment: acpi-kcsan-20230921-01.log.xz
Description: application/xz

Attachment: config-6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0.xz
Description: application/xz


[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux