On Thu, Jul 15 2021, David Hildenbrand <david@xxxxxxxxxx> wrote: > On 14.07.21 17:25, Pierre Morel wrote: >> STSI(15.1.x) gives information on the CPU configuration topology. >> Let's accept the interception of STSI with the function code 15 and >> let the userland part of the hypervisor handle it. >> >> Signed-off-by: Pierre Morel <pmorel@xxxxxxxxxxxxx> >> --- >> arch/s390/kvm/priv.c | 11 ++++++++++- >> 1 file changed, 10 insertions(+), 1 deletion(-) >> >> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c >> index 9928f785c677..4ab5f8b7780e 100644 >> --- a/arch/s390/kvm/priv.c >> +++ b/arch/s390/kvm/priv.c >> @@ -856,7 +856,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu) >> if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE) >> return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP); >> >> - if (fc > 3) { >> + if (fc > 3 && fc != 15) { >> kvm_s390_set_psw_cc(vcpu, 3); >> return 0; >> } >> @@ -893,6 +893,15 @@ static int handle_stsi(struct kvm_vcpu *vcpu) >> goto out_no_data; >> handle_stsi_3_2_2(vcpu, (void *) mem); >> break; >> + case 15: >> + if (sel1 != 1 || sel2 < 2 || sel2 > 6) >> + goto out_no_data; >> + if (vcpu->kvm->arch.user_stsi) { >> + insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2); >> + return -EREMOTE; This bypasses the trace event further down. >> + } >> + kvm_s390_set_psw_cc(vcpu, 3); >> + return 0; >> } >> if (kvm_s390_pv_cpu_is_protected(vcpu)) { >> memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem, >> > > 1. Setting GPRS to 0 > > I was wondering why we have the "vcpu->run->s.regs.gprs[0] = 0;" > for existing fc 1,2,3 in case we set cc=0. > > Looking at the doc, all I find is: > > "CC 0: Requested configuration-level number placed in > general register 0 or requested SYSIB informa- > tion stored" > > But I don't find where it states that we are supposed to set > general register 0 to 0. Wouldn't we also have to do it for > fc=15 or for none? > > If fc 1,2,3 and 15 are to be handled equally, I suggest the following: > > diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c > index 9928f785c677..6eb86fa58b0b 100644 > --- a/arch/s390/kvm/priv.c > +++ b/arch/s390/kvm/priv.c > @@ -893,17 +893,23 @@ static int handle_stsi(struct kvm_vcpu *vcpu) > goto out_no_data; > handle_stsi_3_2_2(vcpu, (void *) mem); > break; > + case 15: > + if (sel1 != 1 || sel2 < 2 || sel2 > 6) > + goto out_no_data; > + break; > } > - if (kvm_s390_pv_cpu_is_protected(vcpu)) { > - memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem, > - PAGE_SIZE); > - rc = 0; > - } else { > - rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE); > - } > - if (rc) { > - rc = kvm_s390_inject_prog_cond(vcpu, rc); > - goto out; > + if (mem) { > + if (kvm_s390_pv_cpu_is_protected(vcpu)) { > + memcpy((void *)sida_origin(vcpu->arch.sie_block), > + (void *)mem, PAGE_SIZE); > + } else { > + rc = write_guest(vcpu, operand2, ar, (void *)mem, > + PAGE_SIZE); > + if (rc) { > + rc = kvm_s390_inject_prog_cond(vcpu, rc); > + goto out; > + } > + } > } > if (vcpu->kvm->arch.user_stsi) { > insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2); Something like that sounds good, the code is getting a bit convoluted. > > > 2. maximum-MNest facility > > " > 1. If the maximum-MNest facility is installed and > selector 2 exceeds the nonzero model-depen- > dent maximum-selector-2 value." > > 2. If the maximum-MNest facility is not installed and > selector 2 is not specified as two. > " > > We will we be handling the presence/absence of the maximum-MNest facility > (for our guest?) in QEMU, corect? > > I do wonder if we should just let any fc=15 go to user space let the whole > sel1 / sel2 checking be handled there. I don't think it's a fast path after all. > But no strong opinion. If that makes handling easier, I think it would be a good idea. > > How do we identify availability of maximum-MNest facility? > > > 3. User space awareness > > How can user space identify that we actually forward these intercepts? > How can it enable them? The old KVM_CAP_S390_USER_STSI capability > is not sufficient. Why do you think that it is not sufficient? USER_STSI basically says "you may get an exit that tells you about a buffer to fill in some more data for a stsi command, and we also tell you which call". If userspace does not know what to add for a certain call, it is free to just do nothing, and if it does not get some calls it would support, that should not be a problem, either? > > I do wonder if we want KVM_CAP_S390_USER_STSI_15 or sth like that to change > the behavior once enabled by user space. > > > 4. Without vcpu->kvm->arch.user_stsi, we indicate cc=0 to our guest, > also for fc 1,2,3. Is that actually what we want? (or do we simply not care > because the guest is not supposed to use stsi?) If returning an empty buffer is ok, it should not be a problem, I guess. (I have not looked yet at the actual definitions.)