Re: [PATCH v2 2/2] s390/kvm: handle diagnose 318 instruction call

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/7/18 11:33 AM, David Hildenbrand wrote:
> On 06.12.18 23:30, Collin Walling wrote:
>> The diagnose 318 instruction is a privileged instruction that must be
>> interpreted by SIE and handled via KVM.
>>
>> The control program name and version codes (CPNC and CPVC) set by this
>> instruction are saved to the kvm->arch struct. The CPNC is also set in
>> the SIE control block of all VCPUs. The new kvm_s390_set_machine
>> interface is introduced for migration.
>>
>> Signed-off-by: Collin Walling <walling@xxxxxxxxxxxxx>
>> Reviewed-by: Janosch Frank <frankja@xxxxxxxxxxxxx>
>> ---
>>  arch/s390/include/asm/kvm_host.h | 13 +++++-
>>  arch/s390/include/uapi/asm/kvm.h |  5 +++
>>  arch/s390/kvm/diag.c             | 12 ++++++
>>  arch/s390/kvm/kvm-s390.c         | 92 ++++++++++++++++++++++++++++++++++++++++
>>  arch/s390/kvm/kvm-s390.h         |  1 +
>>  arch/s390/kvm/vsie.c             |  7 +++
>>  6 files changed, 129 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>> index d5d2488..ad42949 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -229,7 +229,8 @@ struct kvm_s390_sie_block {
>>  	__u32	scaol;			/* 0x0064 */
>>  	__u8	reserved68;		/* 0x0068 */
>>  	__u8    epdx;			/* 0x0069 */
>> -	__u8    reserved6a[2];		/* 0x006a */
>> +	__u8    cpnc;			/* 0x006a */
> 
> So, no field for cpvc? Can you give me a short summary how these values
> are to be used. E.g. who will use the cpvc? Who can actually read both
> values. A guest VCPU? Firmware? (well firmware obviously has no access
> to cpvc)
> 

Only the CPNC is architected to be stored in the SIE block... CPVC has to be held elsewhere.

The CPNC can be seen during a ring dump, which is helpful when a problem causes a CPU to
enter a check-stop state and we can't get a CEC dump.

We can really only see the CPVC from the host's perspective, perhaps during a guest dump 
or via VM_EVENT messages in the guest debug logs.

>> +	__u8	reserved6b;		/* 0x006b */
>>  	__u32	todpr;			/* 0x006c */
>>  #define GISA_FORMAT1 0x00000001
>>  	__u32	gd;			/* 0x0070 */
>> @@ -391,6 +392,7 @@ struct kvm_vcpu_stat {
>>  	u64 diagnose_9c;
>>  	u64 diagnose_258;
>>  	u64 diagnose_308;
>> +	u64 diagnose_318;
>>  	u64 diagnose_500;
>>  	u64 diagnose_other;
>>  };
>> @@ -804,6 +806,14 @@ struct kvm_s390_vsie {
>>  	struct page *pages[KVM_MAX_VCPUS];
>>  };
>>  
>> +union kvm_s390_diag318_info {
>> +	u64 cpc;
>> +	struct {
>> +		u64 cpnc : 8;
>> +		u64 cpvc : 56;
>> +	};
>> +};
>> +
>>  struct kvm_arch{
>>  	void *sca;
>>  	int use_esca;
>> @@ -838,6 +848,7 @@ struct kvm_arch{
>>  	/* subset of available cpu features enabled by user space */
>>  	DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
>>  	struct kvm_s390_gisa *gisa;
>> +	union kvm_s390_diag318_info diag318_info;
>>  };
>>  
>>  #define KVM_HVA_ERR_BAD		(-1UL)
>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>> index 16511d9..6420aad 100644
>> --- a/arch/s390/include/uapi/asm/kvm.h
>> +++ b/arch/s390/include/uapi/asm/kvm.h
>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>>  #define KVM_S390_VM_CRYPTO		2
>>  #define KVM_S390_VM_CPU_MODEL		3
>>  #define KVM_S390_VM_MIGRATION		4
>> +#define KVM_S390_VM_MACHINE		5
>>  
>>  /* kvm attributes for mem_ctrl */
>>  #define KVM_S390_VM_MEM_ENABLE_CMMA	0
>> @@ -130,6 +131,7 @@ struct kvm_s390_vm_cpu_machine {
>>  #define KVM_S390_VM_CPU_FEAT_PFMFI	11
>>  #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
>>  #define KVM_S390_VM_CPU_FEAT_KSS	13
>> +#define KVM_S390_VM_CPU_FEAT_DIAG318	14
>>  struct kvm_s390_vm_cpu_feat {
>>  	__u64 feat[16];
>>  };
>> @@ -168,6 +170,9 @@ struct kvm_s390_vm_cpu_subfunc {
>>  #define KVM_S390_VM_MIGRATION_START	1
>>  #define KVM_S390_VM_MIGRATION_STATUS	2
>>  
>> +/* kvm attributes for KVM_S390_VM_MACHINE */
>> +#define KVM_S390_VM_MACHINE_CPC	0
>> +
>>  /* for KVM_GET_REGS and KVM_SET_REGS */
>>  struct kvm_regs {
>>  	/* general purpose regs for s390 */
>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
>> index 45634b3d..b61ffd2 100644
>> --- a/arch/s390/kvm/diag.c
>> +++ b/arch/s390/kvm/diag.c
>> @@ -235,6 +235,16 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu)
>>  	return ret < 0 ? ret : 0;
>>  }
>>  
>> +static int __diag_set_control_prog_name(struct kvm_vcpu *vcpu)
>> +{
>> +	unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
>> +	unsigned long cpc = vcpu->run->s.regs.gprs[reg];
> 
> 1. This should be a u64.
> 
> 2. What if a !KVM_S390_VM_CPU_FEAT_DIAG318 ? Shouldn't there be an error
> reported to the guest? (diag subcode not installed)
> 

Since we're emulating the diag318 feature entirely for a guest, and we're able
to fence its availability via CPU models in QEMU (which is not obvious from the
KVM code, admittedly), we can safely allow this feature if KVM can support it.

With both KVM and QEMU support, the guest kernel will be able to pass the 
if (sclp.has_diag318) check in patch #1 and execute the diag instruction (with 
KVM handling) safely.

The whole implementation of these values is a bit tricky, I admit. I find myself
going back-and-forth to the docs to make sure I have this info correct.

Please let me know if there's a better way to convey this information to someone
who might have to look back on this code. Perhaps some documentation on the new 
VM attribute (suggested by Conny) would suffice.

>> +
>> +	vcpu->stat.diagnose_318++;
>> +	kvm_s390_set_cpc(vcpu->kvm, cpc);
>> +	return 0;
>> +}
>> +
>>  int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
>>  {
>>  	int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff;
>> @@ -254,6 +264,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
>>  		return __diag_page_ref_service(vcpu);
>>  	case 0x308:
>>  		return __diag_ipl_functions(vcpu);
>> +	case 0x318:
>> +		return __diag_set_control_prog_name(vcpu);
>>  	case 0x500:
>>  		return __diag_virtio_hypercall(vcpu);
>>  	default:
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index fe24150..67aa34a 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -157,6 +157,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>>  	{ "instruction_diag_9c", VCPU_STAT(diagnose_9c) },
>>  	{ "instruction_diag_258", VCPU_STAT(diagnose_258) },
>>  	{ "instruction_diag_308", VCPU_STAT(diagnose_308) },
>> +	{ "instruction_diag_318", VCPU_STAT(diagnose_318) },
>>  	{ "instruction_diag_500", VCPU_STAT(diagnose_500) },
>>  	{ "instruction_diag_other", VCPU_STAT(diagnose_other) },
>>  	{ NULL }
>> @@ -371,6 +372,10 @@ static void kvm_s390_cpu_feat_init(void)
>>  
>>  	if (MACHINE_HAS_ESOP)
>>  		allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
>> +
>> +	/* Enable DIAG318 guest support unconditionally */
>> +	allow_cpu_feat(KVM_S390_VM_CPU_FEAT_DIAG318);
> 
> Interesting. Is that the right thing to do?
> 
> While somebody can always write (__diag_set_control_prog_name emualtion
> handler), I don't see a read handler.
> 
> How is that supposed to work e.g. on older hardware?
> 
> (I was expecting HW to handle reading, that's why we add it to the SCB-
> Can you elaborate what the value stored in the SCB is actually used for?
> Is a guest CPU not actually able to read that value somehow?)
> 

A CPU (or VCPU) should have a copy of the CPNC stored in a local register. 
I'm sorry I can't be super helpful in my explanation, as I am not 100%
familiar with the ring dump process, but perhaps the CPU retrieves the
value from the SCB during a ring dump? I'll have to phone-a-friend on this
one.

Please see my explanation to your questions above and let me know if that
covers this question as well.

I think a more informative comment in the code would be helpful here.

>> +
>>  	/*
>>  	 * We need SIE support, ESOP (PROT_READ protection for gmap_shadow),
>>  	 * 64bit SCAO (SCA passthrough) and IDTE (for gmap_shadow unshadowing).
>> @@ -1173,6 +1178,75 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr)
>>  	return ret;
>>  }
>>  
>> +void kvm_s390_set_cpc(struct kvm *kvm, u64 cpc)
>> +{
>> +	struct kvm_vcpu *vcpu;
>> +	int i;
>> +
>> +	mutex_lock(&kvm->lock);
>> +	kvm_s390_vcpu_block_all(kvm);
> 
>> +
>> +	kvm->arch.diag318_info.cpc = cpc;
>> +
>> +	VM_EVENT(kvm, 3, "SET: cpnc: 0x%x cpvc: 0x%llx",
>> +		 (u8)kvm->arch.diag318_info.cpnc, (u64)kvm->arch.diag318_info.cpvc);
> 
> The second cast (u64) might not be necessary.
> 

You might be right. I'll double-check if the compiler yells at me.

>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc;
>> +	}
>> +
>> +	kvm_s390_vcpu_unblock_all(kvm);
> 
> Looks sane to me
> 

:)

>> +	mutex_unlock(&kvm->lock);
>> +}
>> +
>> +static int kvm_s390_vm_set_machine(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> +	int ret;
>> +	u64 cpc;
>> +
>> +	switch (attr->attr) {
>> +	case KVM_S390_VM_MACHINE_CPC:
>> +		ret = -EFAULT;
>> +		if (get_user(cpc, (u64 __user *)attr->addr))
>> +			break;
>> +		kvm_s390_set_cpc(kvm, cpc);
>> +		ret = 0;
>> +		break;
>> +	default:
>> +		ret = -ENXIO;
>> +		break;
>> +	}
>> +	return ret;
>> +}
>> +
>> +static int kvm_s390_get_cpc(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> +	u64 cpc = kvm->arch.diag318_info.cpc;
>> +
>> +	if (put_user(cpc, (u64 __user *)attr->addr))
>> +		return -EFAULT;
>> +
>> +	VM_EVENT(kvm, 3, "QUERY: cpnc: 0x%x cpvc: 0x%llx",
>> +		 (u8)kvm->arch.diag318_info.cpnc, (u64)kvm->arch.diag318_info.cpvc);
>> +
>> +	return 0;
>> +}
>> +
>> +static int kvm_s390_get_misc(struct kvm *kvm, struct kvm_device_attr *attr)
> 
> get_machine
> 

Whoops, thanks!

>> +{
>> +	int ret;
>> +
>> +	switch (attr->attr) {
>> +	case KVM_S390_VM_MACHINE_CPC:
>> +		ret = kvm_s390_get_cpc(kvm, attr);
>> +		break;
>> +	default:
>> +		ret = -ENXIO;
>> +		break;
>> +	}
>> +	return ret;
>> +}
>> +
>>  static int kvm_s390_set_processor(struct kvm *kvm, struct kvm_device_attr *attr)
>>  {
>>  	struct kvm_s390_vm_cpu_processor *proc;
>> @@ -1435,6 +1509,9 @@ static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
>>  	case KVM_S390_VM_MIGRATION:
>>  		ret = kvm_s390_vm_set_migration(kvm, attr);
>>  		break;
>> +	case KVM_S390_VM_MACHINE:
>> +		ret = kvm_s390_vm_set_machine(kvm, attr);
>> +		break;
>>  	default:
>>  		ret = -ENXIO;
>>  		break;
>> @@ -1460,6 +1537,9 @@ static int kvm_s390_vm_get_attr(struct kvm *kvm, struct kvm_device_attr *attr)
>>  	case KVM_S390_VM_MIGRATION:
>>  		ret = kvm_s390_vm_get_migration(kvm, attr);
>>  		break;
>> +	case KVM_S390_VM_MACHINE:
>> +		ret = kvm_s390_get_misc(kvm, attr);
>> +		break;
>>  	default:
>>  		ret = -ENXIO;
>>  		break;
>> @@ -1534,6 +1614,16 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr)
>>  	case KVM_S390_VM_MIGRATION:
>>  		ret = 0;
>>  		break;
>> +	case KVM_S390_VM_MACHINE:
>> +		switch (attr->attr) {
>> +		case KVM_S390_VM_MACHINE_CPC:
>> +			ret = 0;
>> +			break;
>> +		default:
>> +			ret = -ENXIO;
>> +			break;
>> +		}
>> +		break;
>>  	default:
>>  		ret = -ENXIO;
>>  		break;
>> @@ -2650,7 +2740,9 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>>  	preempt_disable();
>>  	vcpu->arch.sie_block->epoch = vcpu->kvm->arch.epoch;
>>  	vcpu->arch.sie_block->epdx = vcpu->kvm->arch.epdx;
>> +	vcpu->arch.sie_block->cpnc = vcpu->kvm->arch.diag318_info.cpnc;
> 
> Move that out of the preempt section. That is only needed for the
> epoch/epdx. cpc is fully covered by the kvm lock.
> 

Will do.

>>  	preempt_enable();
>> +
>>  	mutex_unlock(&vcpu->kvm->lock);
>>  	if (!kvm_is_ucontrol(vcpu->kvm)) {
>>  		vcpu->arch.gmap = vcpu->kvm->arch.gmap;
>> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
>> index 1f6e36c..8137f15 100644
>> --- a/arch/s390/kvm/kvm-s390.h
>> +++ b/arch/s390/kvm/kvm-s390.h
>> @@ -281,6 +281,7 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu);
>>  int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu);
>>  
>>  /* implemented in kvm-s390.c */
>> +void kvm_s390_set_cpc(struct kvm *kvm, u64 cpc);
>>  void kvm_s390_set_tod_clock(struct kvm *kvm,
>>  			    const struct kvm_s390_vm_tod_clock *gtod);
>>  long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index a153257..1a6a9b6 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -397,6 +397,9 @@ static void unshadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>>  	memcpy(scb_o->gcr, scb_s->gcr, 128);
>>  	scb_o->pp = scb_s->pp;
>>  
>> +	/* machine information */
>> +	scb_o->cpnc = scb_s->cpnc;
> 
> Is unshadow needed? I think we have diag318 for that.
> 

Don't we need to make sure the CPNC in the SCB for a level 3+ guest is set correctly? I don't
know if we support anything other than "linux on linux on linux" so-to-speak, but if we do
then we need to make sure that when diag318 is called we correctly convey that value to the
correct SCB.

Correct me if I have my understanding of SIE / vSIE handling wrong here.

>> +
>>  	/* branch prediction */
>>  	if (test_kvm_facility(vcpu->kvm, 82)) {
>>  		scb_o->fpf &= ~FPF_BPBC;
>> @@ -473,6 +476,10 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>>  	scb_s->lctl = scb_o->lctl;
>>  	scb_s->svcc = scb_o->svcc;
>>  	scb_s->ictl = scb_o->ictl;
>> +
>> +	/* machine information */
> 
> Please use as the comment the full name (e.g. control program name and
> version)
> 

Okay.

>> +	scb_s->cpnc = scb_o->cpnc;
>> +
>>  	/*
>>  	 * SKEY handling functions can't deal with false setting of PTE invalid
>>  	 * bits. Therefore we cannot provide interpretation and would later
>>
> 
> 

Thanks for the review.

-- 
Respectfully,
- Collin Walling




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux