On Wed, Sep 30 2020 at 20:35, Peter Zijlstra wrote:
> On Wed, Sep 30, 2020 at 08:00:59PM +0200, Thomas Gleixner wrote:
>> On Wed, Sep 30 2020 at 19:03, Peter Zijlstra wrote:
>> > On Wed, Sep 30, 2020 at 05:40:08PM +0200, Thomas Gleixner wrote:
>> > Also, that preempt_disable() in there doesn't actually do anything.
>> > Worse, preempt_disable(); for_each_cpu(); is an anti-pattern. It mixes
>> > static_cpu_has() and boot_cpu_has() in the same bloody condition and
>> > has a pointless ret variable.
>
> Also, I forgot to add, it accesses ->cpus_mask without the proper
> locking, so it could be reading intermediate state from whatever
> cpumask operation that's in progress.

Yes. I saw that after hitting send. :(

>> I absolutely agree and I really missed it when looking at it before
>> merging. cpus_read_lock()/unlock() is the right thing to do if at all.
>>
>> > It's shoddy code, that only works if you align the planets right. We
>> > really shouldn't provide interfaces that are this bad.
>> >
>> > Its correct operation is only by accident.
>>
>> True :(
>>
>> I understand Balbir's problem and it makes some sense to provide a
>> solution. We can:
>>
>>   1) reject set_affinity() if the task has that flush muck enabled
>>      and user space tries to move it to a SMT enabled core
>>
>>   2) disable the muck if it detects that it suddenly runs on a SMT
>>      enabled core (hotplug says hello)
>>
>>      This one is nasty because there is no feedback to user space
>>      about the wreckage.
>
> That's and, right, not or, because 1) deals with sched_setaffinity()
> and 2) deals with hotplug.

It was meant as AND of course.

> Now 1) requires an arch hook in sched_setaffinity(), something I'm not
> keen on providing. Once we provide it, who knows what strange and
> wonderful things archs will dream up.

I don't think so. We can have that magic in core:

#ifdef CONFIG_HAS_PARANOID_L1D_FLUSH
static bool paranoid_l1d_valid(struct task_struct *tsk,
			       const struct cpumask *msk)
{
	bool res;

	if (!test_tsk_thread_flag(tsk, TIF_SPEC_L1D_FLUSH))
		return true;
	/* Do magic stuff which sets res */
	return res;
}
#else
static bool paranoid_l1d_valid(struct task_struct *tsk,
			       const struct cpumask *msk)
{
	return true;
}
#endif

It's a pretty well defined problem and having the magic in core code
prevents an arch hook which allows abuse of all sorts.

And the same applies to enable_l1d_flush_for_task(). The only
architecture specific nonsense are the checks whether the CPU bug is
there and whether the hardware supports L1D flushing.

So we can have:

#ifdef CONFIG_HAS_PARANOID_L1D_FLUSH
int paranoid_l1d_enable(struct task_struct *tsk)
{
	int res;

	/* Do the SMT validation under the proper locks, which sets res */
	if (!res)
		set_tsk_thread_flag(tsk, TIF_SPEC_L1D_FLUSH);
	return res;
}
#endif

> And 2) also happens on hot-un-plug, when the task's affinity gets
> forced because it became empty. No user feedback there either, and
> information is lost.

Of course. It's both the case that SMT suddenly gets enabled on a core
which was isolated and the case that the last isolated core in the
task's CPU mask goes offline.

> I suppose we can do 2) but send a signal. That would cover all cases
> and keep it in arch code. But yes, that's pretty terrible too.

Bah. I just looked at the condition to flush:

	if (sched_smt_active() && !this_cpu_read(cpu_info.smt_active) &&
	    (prev_mm & LAST_USER_MM_L1D_FLUSH))
		l1d_flush_hw();

That fails to flush when SMT is disabled globally. Balbir?
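For illustration, the SMT validation hiding behind the /* Do magic stuff */
placeholder above could be sketched roughly as below. This is an untested
sketch, not code from the thread; the helper name and the exact locking are
assumptions:

#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/topology.h>

/*
 * Reject an affinity mask which contains a CPU with SMT siblings, as the
 * L1D flush on context switch cannot protect against a sibling snooping
 * L1D while the task runs.
 */
static bool l1d_flush_mask_valid(const struct cpumask *msk)
{
	unsigned int cpu;
	bool valid = true;

	/* Keep the sibling masks stable against CPU hotplug */
	cpus_read_lock();
	for_each_cpu(cpu, msk) {
		if (cpumask_weight(topology_sibling_cpumask(cpu)) > 1) {
			valid = false;
			break;
		}
	}
	cpus_read_unlock();
	return valid;
}

paranoid_l1d_valid() above would then reduce to checking
TIF_SPEC_L1D_FLUSH and calling such a helper on the new mask.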
Of course this should be:

	if (!this_cpu_read(cpu_info.smt_active) &&
	    (prev_mm & LAST_USER_MM_L1D_FLUSH))
		l1d_flush_hw();

Now we can make this:

	if (unlikely(prev_mm & LAST_USER_MM_L1D_FLUSH)) {
		if (!this_cpu_read(cpu_info.smt_active))
			l1d_flush_hw();
		else
			task_work_add(...);
	}

And that task work clears the flag and sends a signal. We're not going
to send a signal from switch_mm() ....

Thanks,

        tglx
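For completeness, the task work mentioned above could look roughly like
the sketch below. This is an illustration only, not anything decided in
this thread; the function names, the callback_head member in task_struct
and the use of SIGBUS are assumptions:

#include <linux/sched.h>
#include <linux/sched/signal.h>
#include <linux/task_work.h>

/* Runs in the context of the task on its way back to user space */
static void l1d_flush_force_sigbus(struct callback_head *ch)
{
	/* Disable the opt-in flush and notify the task */
	clear_tsk_thread_flag(current, TIF_SPEC_L1D_FLUSH);
	force_sig(SIGBUS);
}

/* Armed from the switch_mm() path instead of sending a signal there */
static void l1d_flush_queue_kill(struct task_struct *tsk)
{
	/* Assumes a hypothetical callback_head member in task_struct */
	init_task_work(&tsk->l1d_flush_kill, l1d_flush_force_sigbus);

	/* If the task is already exiting there is nothing to notify */
	task_work_add(tsk, &tsk->l1d_flush_kill, TWA_RESUME);
}

The key point is the deferral: switch_mm() only queues the work, and the
flag clearing plus the signal happen in task context.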