> On Dec 8, 2018, at 2:52 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: > > On Fri, Dec 07, 2018 at 04:40:52PM -0800, Nadav Amit wrote: > >>> I'm actually having difficulty finding the this_cpu_read() in any of the >>> functions you mention, so I cannot make any concrete suggestions other >>> than pointing at the alternative functions available. >> >> >> So I got deeper into the code to understand a couple of differences. In the >> case of select_idle_sibling(), the patch (Peter’s) increase the function >> code size by 123 bytes (over the baseline of 986). The per-cpu variable is >> called through the following call chain: >> >> select_idle_sibling() >> => select_idle_cpu() >> => local_clock() >> => raw_smp_processor_id() >> >> And results in 2 more calls to sched_clock_cpu(), as the compiler assumes >> the processor id changes in between (which obviously wouldn’t happen). > > That is the thing with raw_smp_processor_id(), it is allowed to be used > in preemptible context, and there it _obviously_ can change between > subsequent invocations. > > So again, this change is actually good. > > If we want to fix select_idle_cpu(), we should maybe not use > local_clock() there but use sched_clock_cpu() with a stable argument, > this code runs with IRQs disabled and therefore the CPU number is stable > for us here. > >> There may be more changes around, which I didn’t fully analyze. But >> the very least reading the processor id should not get “volatile”. >> >> As for finish_task_switch(), the impact is only few bytes, but still >> unnecessary. It appears that with your patch preempt_count() causes multiple >> reads of __preempt_count in this code: >> >> if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET, >> "corrupted preempt_count: %s/%d/0x%x\n", >> current->comm, current->pid, preempt_count())) >> preempt_count_set(FORK_PREEMPT_COUNT); > > My patch proposed here: > > https://marc.info/?l=linux-mm&m=154409548410209 > > would actually fix that one I think, preempt_count() uses > raw_cpu_read_4() which will loose the volatile with that patch. Sorry for the spam from yesterday. That what happens when I try to write emails on my phone while I’m distracted. I tested the patch you referenced, and it certainly improves the situation for reads, but there are still small and big issues lying around. The biggest one is that (I think) smp_processor_id() should apparently use __this_cpu_read(). Anyhow, when preemption checks are on, it is validated that smp_processor_id() is not used while preemption is enabled, and IRQs are not supposed to change its value. Otherwise the generated code of many functions is affected. There are all kind of other smaller issues, such as set_irq_regs() and get_irq_regs(), which should run with disabled interrupts. They affect the generated code in do_IRQ() and others. But beyond that, there are so many places in the code that use this_cpu_read() while IRQs are guaranteed to be disabled. For example arch/x86/mm/tlb.c is full with this_cpu_read/write() and almost(?) all should be running with interrupts disabled. Having said that, in my build only flush_tlb_func_common() was affected.