[PATCH v3 8/8] KVM-doc: Add paravirt tlb flush document

"Nikunj A. Dadhania" <nikunj@xxxxxxxxxxxxxxxxxx> · Tue, 31 Jul 2012 16:19:30 +0530

Signed-off-by: Nikunj A. Dadhania <nikunj@xxxxxxxxxxxxxxxxxx>
---
 Documentation/virtual/kvm/msr.txt                |    4 ++
 Documentation/virtual/kvm/paravirt-tlb-flush.txt |   53 ++++++++++++++++++++++
 2 files changed, 57 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual/kvm/paravirt-tlb-flush.txt

diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
index 7304710..92a6af6 100644
--- a/Documentation/virtual/kvm/msr.txt
+++ b/Documentation/virtual/kvm/msr.txt
@@ -256,3 +256,7 @@ MSR_KVM_EOI_EN: 0x4b564d04
 	guest must both read the least significant bit in the memory area and
 	clear it using a single CPU instruction, such as test and clear, or
 	compare and exchange.
+
+MSR_KVM_VCPU_STATE: 0x4b564d05
+
+	See Documentation/virtual/kvm/paravirt-tlb-flush.txt
diff --git a/Documentation/virtual/kvm/paravirt-tlb-flush.txt b/Documentation/virtual/kvm/paravirt-tlb-flush.txt
new file mode 100644
index 0000000..0eaabd7
--- /dev/null
+++ b/Documentation/virtual/kvm/paravirt-tlb-flush.txt
@@ -0,0 +1,53 @@
+KVM - Paravirt TLB Flush
+Nikunj A Dadhania <nikunj@xxxxxxxxxxxxxxxxxx>, IBM, 2012
+========================================================
+
+The remote TLB flush APIs busy-wait, which is fine in a bare-metal
+scenario. Within a guest, however, the vcpus might have been pre-empted
+or blocked. In this scenario, the initiator vcpu would end up
+busy-waiting for a long time.
+
+To avoid this, the guest needs to know whether each vcpu is running
+or not running before deciding to wait for it. The following MSR
+exposes the vcpu running state to the guest.
+
+Using this MSR, we have implemented a paravirt TLB flush that does
+not wait for vcpus that are not running. The TLB flush for those
+vcpus is deferred and is done on guest enter.
+
+MSR_KVM_VCPU_STATE: 0x4b564d05
+
+	data: 64-byte aligned physical address of a memory area which must be
+	in guest RAM, plus an enable bit in bit 0. This memory is expected to
+	hold a copy of the following structure:
+
+	struct kvm_vcpu_state {
+		__u64 state;
+		__u32 pad[14];
+	}
+
+	whose data will be filled in by the hypervisor/guest. Only one
+	write, or registration, is needed for each VCPU.  The interval
+	between updates of this structure is arbitrary and
+	implementation-dependent.  The hypervisor may update this
+	structure at any time it sees fit until a value with bit 0 ==
+	0 is written to the MSR. The guest is required to make sure
+	this structure is initialized to zero.
+
+	This enables a VCPU to know the running status of its sibling
+	VCPUs. The information can then be used to avoid sending an
+	IPI to a non-running VCPU and waiting for it unnecessarily,
+	for example in flush_tlb_others_ipi.
+
+	Fields have the following meanings:
+
+		state: has the following bit fields:
+
+		Bit 0 - vcpu running state, set by the hypervisor.
+		        Value 1 means the vcpu is running and
+		        value 0 means the vcpu has been
+		        pre-empted.
+
+		Bit 1 - should-flush flag. When set, the hypervisor
+		        flushes the TLB on guest enter/exit.
+

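For illustration only, here is one way a guest could consume the two
documented state bits when deciding which vcpus need a flush IPI. This is
a user-space sketch, not the code from this series: the macro names, the
struct name, the use of C11 atomics, and both helper functions are
assumptions made for the example. It also assumes that the guest is the
side that sets bit 1, which the document does not state. Only the 64-byte
layout and the meaning of bits 0 and 1 come from the document above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two documented bits of 'state'. */
#define VCPU_STATE_RUNNING      (UINT64_C(1) << 0)
#define VCPU_STATE_SHOULD_FLUSH (UINT64_C(1) << 1)

/* Same 64-byte layout as the documented structure; the atomic
 * type is an assumption of this sketch. */
struct vcpu_state_sketch {
	_Atomic uint64_t state;
	uint32_t pad[14];
};

static_assert(sizeof(struct vcpu_state_sketch) == 64,
	      "must match the 64-byte memory area");

/*
 * Returns true if the caller must send an IPI to this vcpu and wait
 * for it, false if the flush was deferred by setting bit 1.
 */
static bool flush_needs_ipi(struct vcpu_state_sketch *vs)
{
	uint64_t old = atomic_load(&vs->state);

	for (;;) {
		if (old & VCPU_STATE_RUNNING)
			return true;	/* running: flush via IPI as usual */
		/* Not running: request a deferred flush, but only if the
		 * state did not change between the load and this update. */
		if (atomic_compare_exchange_weak(&vs->state, &old,
						 old | VCPU_STATE_SHOULD_FLUSH))
			return false;
		/* 'old' now holds the fresh value; re-check the running bit. */
	}
}

/* Builds the mask of target vcpus (up to 64) that still need an IPI. */
static uint64_t build_ipi_mask(struct vcpu_state_sketch *states,
			       unsigned int nr_vcpus, uint64_t targets)
{
	uint64_t ipi_mask = 0;

	for (unsigned int i = 0; i < nr_vcpus && i < 64; i++) {
		if (!(targets & (UINT64_C(1) << i)))
			continue;
		if (flush_needs_ipi(&states[i]))
			ipi_mask |= UINT64_C(1) << i;
	}
	return ipi_mask;
}
```

The compare-and-exchange loop is there because the hypervisor may flip
bit 0 at any time; a plain read-modify-write could set the should-flush
bit on a vcpu that has just started running and would then miss its flush.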