On Thu, Sep 19, 2024 at 1:34 AM Peter Maydell <peter.maydell@xxxxxxxxxx> wrote: > > On Wed, 18 Sept 2024 at 14:49, Heinrich Schuchardt > <heinrich.schuchardt@xxxxxxxxxxxxx> wrote: > > > > On 18.09.24 15:12, Peter Maydell wrote: > > > On Wed, 18 Sept 2024 at 14:06, Heinrich Schuchardt > > > <heinrich.schuchardt@xxxxxxxxxxxxx> wrote: > > >> Thanks Peter for looking into this. > > >> > > >> QEMU's cpu_synchronize_all_post_init() and > > >> do_kvm_cpu_synchronize_post_reset() both end up in > > >> kvm_arch_put_registers() and that is long after Linux > > >> kvm_arch_vcpu_create() has been setting some FPU state. See the output > > >> below. > > >> > > >> kvm_arch_put_registers() copies the CSRs by calling > > >> kvm_riscv_put_regs_csr(). Here we can find: > > >> > > >> KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); > > >> > > >> This call enables or disables the FPU according to the value of > > >> env->mstatus. > > >> > > >> So we need to set the desired state of the floating point unit in QEMU. > > >> And this is what the current patch does both for TCG and KVM. > > > > > > If it does this for both TCG and KVM then I don't understand > > > this bit from the commit message: > > > > > > # Without this patch EDK II with TLS enabled crashes when hitting the first > > > # floating point instruction while running QEMU with --accel kvm and runs > > > # fine with --accel tcg. > > > > > > Shouldn't this guest crash the same way with both KVM and TCG without > > > this patch, because the FPU state is the same for both? > > > By default `qemu-system-riscv64 --accel tcg` runs OpenSBI as firmware > > which enables the FPU. > > > > If you would choose a different SBI implementation which does not enable > > the FPU you could experience the same crash. > > Ah, so KVM vs TCG is a red herring and it's actually "some guest > firmware doesn't enable the FPU itself, and if you run that then it will > fall over, whether you do it in KVM or TCG" ? That makes more sense. > > I don't have an opinion on whether you want to do that or not, > not knowing what the riscv architecture mandates. (On Arm this > would be fairly clearly "the guest software is broken and > should be fixed", but that's because the Arm architecture > says you can't assume the FPU is enabled from reset.) RISC-V is the same. Section "3.4 Reset" states that: "All other hart state is UNSPECIFIED." (the paragraph doesn't mention the FS state). So it's unspecified what the value is on reset. Guest software shouldn't assume anything about it and it does seem like a guest software bug. In saying that, we are allowed to set it then as the spec doesn't say it should be 0. So setting it to 0x01 (Initial) doesn't seem like a bad idea, as the name kind of implies that it should be 0x01 on reset Alistair > > I do think the commit message could use clarification to > explain this. > > thanks > -- PMM >