On Thu, May 5, 2022 at 11:56 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: > > Alexander, > > On Thu, May 05 2022 at 20:04, Alexander Potapenko wrote: > > On Tue, May 3, 2022 at 12:00 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: > >> > Regarding the regs, you are right. It should be enough to unpoison the > >> > regs at idtentry prologue instead. > >> > I tried that initially, but IIRC it required patching each of the > >> > DEFINE_IDTENTRY_XXX macros, which already use instrumentation_begin(). > >> > >> Exactly 4 instances :) > >> > > > > Not really, I had to add a call to `kmsan_unpoison_memory(regs, > > sizeof(*regs));` to the following places in > > arch/x86/include/asm/idtentry.h: > > - DEFINE_IDTENTRY() > > - DEFINE_IDTENTRY_ERRORCODE() > > - DEFINE_IDTENTRY_RAW() > > - DEFINE_IDTENTRY_RAW_ERRORCODE() > > - DEFINE_IDTENTRY_IRQ() > > - DEFINE_IDTENTRY_SYSVEC() > > - DEFINE_IDTENTRY_SYSVEC_SIMPLE() > > - DEFINE_IDTENTRY_DF() > > > > , but even that wasn't enough. For some reason I also had to unpoison > > pt_regs directly in > > DEFINE_IDTENTRY_SYSVEC(sysvec_apic_timer_interrupt) and > > DEFINE_IDTENTRY_IRQ(common_interrupt). > > In the latter case, this could have been caused by > > asm_common_interrupt being entered from irq_entries_start(), but I am > > not sure what is so special about sysvec_apic_timer_interrupt(). > > > > Ideally, it would be great to find that single point where pt_regs are > > set up before being passed to all IDT entries. > > I used to do that by inserting calls to kmsan_unpoison_memory right > > into arch/x86/entry/entry_64.S > > (https://github.com/google/kmsan/commit/3b0583f45f74f3a09f4c7e0e0588169cef918026), > > but that required storing/restoring all GP registers. Maybe there's a > > better way? > > Yes. Something like this should cover all exceptions and syscalls before > anything instrumentable can touch @regs. Anything up to those points is > either off-limit for instrumentation or does not deal with @regs. > > --- a/kernel/entry/common.c > +++ b/kernel/entry/common.c > @@ -24,6 +24,7 @@ static __always_inline void __enter_from > user_exit_irqoff(); > > instrumentation_begin(); > + unpoison(regs); > trace_hardirqs_off_finish(); > instrumentation_end(); > } > @@ -352,6 +353,7 @@ noinstr irqentry_state_t irqentry_enter( > lockdep_hardirqs_off(CALLER_ADDR0); > rcu_irq_enter(); > instrumentation_begin(); > + unpoison(regs); > trace_hardirqs_off_finish(); > instrumentation_end(); > > @@ -367,6 +369,7 @@ noinstr irqentry_state_t irqentry_enter( > */ > lockdep_hardirqs_off(CALLER_ADDR0); > instrumentation_begin(); > + unpoison(regs); > rcu_irq_enter_check_tick(); > trace_hardirqs_off_finish(); > instrumentation_end(); > @@ -452,6 +455,7 @@ irqentry_state_t noinstr irqentry_nmi_en > rcu_nmi_enter(); > > instrumentation_begin(); > + unpoison(regs); > trace_hardirqs_off_finish(); > ftrace_nmi_enter(); > instrumentation_end(); > > As I said: 4 places :) These four instances still do not look sufficient. Right now I am seeing e.g. reports with the following stack trace: BUG: KMSAN: uninit-value in irqtime_account_process_tick+0x255/0x580 kernel/sched/cputime.c:382 irqtime_account_process_tick+0x255/0x580 kernel/sched/cputime.c:382 account_process_tick+0x98/0x450 kernel/sched/cputime.c:476 update_process_times+0xe4/0x3e0 kernel/time/timer.c:1786 tick_sched_handle kernel/time/tick-sched.c:243 tick_sched_timer+0x83e/0x9e0 kernel/time/tick-sched.c:1473 __run_hrtimer+0x518/0xe50 kernel/time/hrtimer.c:1685 __hrtimer_run_queues kernel/time/hrtimer.c:1749 hrtimer_interrupt+0x838/0x15a0 kernel/time/hrtimer.c:1811 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 __sysvec_apic_timer_interrupt+0x1ae/0x680 arch/x86/kernel/apic/apic.c:1103 sysvec_apic_timer_interrupt+0x95/0xc0 arch/x86/kernel/apic/apic.c:1097 ... (uninit creation stack trace is irrelevant here, because it is some random value from the stack) sysvec_apic_timer_interrupt() receives struct pt_regs from uninstrumented code, so regs can be partially uninitialized. They are not passed down the call stack directly, but are instead saved by set_irq_regs() in sysvec_apic_timer_interrupt() and loaded by get_irq_regs() in tick_sched_timer(). The remaining false positives can be fixed by unpoisoning the registers in set_irq_regs(): static inline struct pt_regs *set_irq_regs(struct pt_regs *new_regs) { struct pt_regs *old_regs; + kmsan_unpoison_memory(new_regs, sizeof(*new_regs)); old_regs = __this_cpu_read(__irq_regs); __this_cpu_write(__irq_regs, new_regs); Does that sound viable? Is it correct to assume that set_irq_regs() is always called for registers received from non-instrumented code? (It seems that just unpoisoning registers in set_irq_regs() is not enough, i.e. we still need to do what you suggest in kernel/entry/common.c) -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Liana Sebastian Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Diese E-Mail ist vertraulich. Falls Sie diese fälschlicherweise erhalten haben sollten, leiten Sie diese bitte nicht an jemand anderes weiter, löschen Sie alle Kopien und Anhänge davon und lassen Sie mich bitte wissen, dass die E-Mail an die falsche Person gesendet wurde. This e-mail is confidential. If you received this communication by mistake, please don't forward it to anyone else, please erase all copies and attachments, and please let me know that it has gone to the wrong person.