On 2023-05-17 13:05:32 [+0200], Ard Biesheuvel wrote: > For context, these changes give a substantial performance boost on the > RX path to IPsec or WireGuard using SIMD based software crypto. > > > … > > > Please take a look, and confirm whether or not we still need to drop > > > the get_cpu() call from vfp_sync_hwstate() > > > > Yes, it needs to be dealt with. At the time vfp_sync_hwstate() the > > context is fully preemptible. So that get_cpu() needs to be removed > > before local_bh_disable(). But… > > > > If I understand the logic right, then the VFP state is saved on context > > switch and the FPU access is disabled but the content remains and is in > > sync with vfp_current_hw_state. Upon first usage (after the context > > switch) in userland there is an exception at which point the access to > > the FPU is enabled (FPEXC_EX) and the FPU content is loaded unless it is > > in sync as per vfp_current_hw_state. > > > > Exactly. On UP we lazily preserve the user space FP state, and on SMP > we always preserve it on a context switch. Then, only when user space > actually tries to use the SIMD context, the correct data is copied in. > That way, we don't take the performance hit for tasks that are blocked > in the kernel. > > > That means it relies on disabled preemption while operating on > > vfp_current_hw_state. > > In general, this code relies on preserving/restoring the VFP context > to/from memory to be a critical section, as it needs to run to > completion once it is started. > > > On PREEMPT_RT softirqs are only handled in process > > context (for instance at the end of the threaded interrupt). Therefore > > it is sufficient to disable preemption instead of BH. It would need > > something like x86's fpregs_lock(). > > I think this is not the only issue, to be honest. We cannot preempt > tasks while they are using the SIMD unit in kernel mode, as the kernel > mode context is never preserved/restored. So we at least need to > disable preemption again in kernel_neon_begin() [which now relies on > BH disable to disable preemption but as you point out, that is only > the case on !RT] > > Given this, I wonder whether supporting kernel mode NEON on PREEMPT_RT > is worth it to begin with. AIUI, the alternative is to disable both > preemption and BH when touching the VFP state. Only preemption on RT. So it is debatable if it makes sense to use NEON on RT. The preempt-off section is limited to PAGE_SIZE due the scatter/ gather walk if I remember correctly. > > Otherwise vfp_entry() can get preempted after decisions based on > > vfp_current_hw_state have been made. Also kernel_neon_begin() could get > > preempted after enabling FPU. I think based on current logic, after > > kernel_neon_begin() a task can get preempted on PREEMPT_RT and since > > vfp_current_hw_state is NULL the registers won't be saved and a zero set > > of registers will be loaded once the neon task gets back on the CPU (it > > seems that VFP exceptions in kernel mode are treated the same way as > > those in user mode). > > > > What do you think? > > > > Indeed. kernel_neon_begin() no longer disabling preemption is > definitely wrong on PREEMPT_RT. The question is whether we want to > untangle this or just make PREEMPT_RT and KERNEL_MODE_NEON mutually > exclusive. This, by itself, makes quite a lot of sense, actually, > given that on actual 32-bit hardware, the SIMD speedup is not that > spectacular (the AES and other crypto instructions, which do give an > order of magnitude speedup are only implemented on 64-bit cores, which > usually run a 64-bit kernel). So if real-time behavior is a > requirement, using ordinary crypto implemented in C (which does not > require preemption to be disabled) is likely preferred anyway. I have no opinion which can be backed up by numbers. So I have no problem to go along with it and disable KERNEL_MODE_NEON since it is simpler and the benefit is questionable. So patch - #1 KERNEL_MODE_NEON depends on !RT - #2 disable BH followed by preemption in vfp_sync_hwstate() and vfp_entry(). if so, let me prepare something unless you want to :) Sebastian