On Fri, Nov 4, 2022 at 1:45 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Nov 03, 2022 at 10:53:54PM +0000, Andrew Cooper wrote:
> > On 21/10/2022 16:21, Nathan Chancellor wrote:
> > > On Fri, Oct 21, 2022 at 11:53:09AM +0200, Peter Zijlstra wrote:
> > >> On Thu, Oct 20, 2022 at 04:10:28PM -0700, Nathan Chancellor wrote:
> > >>> This commit is now in -next as commit 5d8213864ade ("x86/retbleed: Add
> > >>> SKL return thunk"). I just bisected an immediate reboot on my AMD test
> > >>> system when starting a virtual machine with QEMU + KVM to it (see the
> > >>> bisect log below). My Intel test systems do not show this.
> > >>> Unfortunately, I do not have much more information, as there are no logs
> > >>> in journalctl, which makes sense as the reboot occurs immediately after
> > >>> I hit the enter key for the QEMU command.
> > >>>
> > >>> If there is any further information I can provide or patches I can test
> > >>> for further debugging, I am more than happy to do so.
> > >> Moo :-(
> > >>
> > >> you happen to have a .config for me?
> > > Sure thing, sorry I did not provide it in the first place! Attached. It
> > > has been run through localmodconfig for the particular machine but I
> > > assume the core pieces should still be present.
> >
> > Following up from some debugging on IRC.
> >
> > The problem is that FILL_RETURN_BUFFER now has a per-cpu variable
> > access, and AMD SVM has a fun optimisation where the VMRUN instruction
> > doesn't swap, amongst other things, %gs.
> >
> > per-cpu variables only become safe following
> > vmload(__sme_page_pa(sd->save_area)); in svm_vcpu_enter_exit().
> >
> > Given that retbleed=force ought to work on non-skylake hardware, the
> > appropriate fix is to move the VMLOAD/VMSAVE's down into asm and put
> > them adjacent to VMRUN.
> >
> > This also addresses an undocumented dependency where it's only the memory
> > clobber in vmload() which stops the compiler moving
> > svm_vcpu_enter_exit()'s calculation of sd into an unsafe position.
>
> So, aside from wasting the entire morning on resuscitating my AMD
> Interlagos, I ended up with the below patch which seems to work.
>
> Not being a virt person, I'm sure I've messed up something, please
> advise.

Oh, that was fast. I was doing similar stuff to move MSR_IA32_SPEC_CTRL
save/restore to assembly, because we're not sure it's safe to do the
restore in C code, and there is overlap with this change. I'll get it
out today.

The main issue in the patch below is that _ASM_ARG4 does not exist on
32-bit, and also _ASM_ARG3 is kinda off-limits because I need it for
the aforementioned MSR_IA32_SPEC_CTRL change. Otherwise it's similar
to my change.

Paolo