On 19.05.2011, at 18:28, Scott Wood wrote: > On Thu, 19 May 2011 16:10:48 +0200 > Alexander Graf <agraf@xxxxxxx> wrote: > >> On 05/18/2011 01:42 AM, Scott Wood wrote: >>> +For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV: >> >> Can we rename those KVM_MMU_BOOKE_206_NOHV and KVM_MMU_BOOKE_206_HV? >> Same for the function names. It's just easier to read and more accurate. >> Also I don't like how it says KVM_MMU_FSL :). It's really a booke interface. > > That's more or less how I had it initially. In our earlier discussion I > changed it to FSL because of assumptions about TLB layout, such as the set > hash: > > http://www.spinics.net/lists/kvm-ppc/msg02641.html > http://www.spinics.net/lists/kvm-ppc/msg02645.html Ah, that discussion was about calling it book3e, and that name is too generic. If the interface could satisfy any book3e 2.06 MMU, then calling it BOOKE_206 would be just fine. If it can only handle e500, call it FSL or E500 - whichever fits you better. I'd prefer if the interface was clever enough to also be used by non-FSL 2.06 implementations, in case any of those follow, if it's not too much additional overhead. > >>> + - The "params" field is of type "struct kvm_fsl_booke_tlb_params". >> >> I suppose params is a pointer? > > Yes. > >>> + - The "array" field points to an array of type "struct >>> + kvm_fsl_booke_tlb_entry". >>> + - The array consists of all entries in the first TLB, followed by all >>> + entries in the second TLB. >>> + - Within TLB0 entries are ordered first by increasing set number. Within a >>> + set, entries are ordered by way (increasing ESEL). >>> + The hash for determining set number is: (MAS2>> 12)& (num_sets - 1) >>> + where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value. >>> + - Within TLB1, entries are ordered by increasing ESEL. >> >> Don't special-case it. TLB1 basically contains of only a single set. The >> MMU interface here shouldn't be too e500 specific :). > > How do I define the hash for determining set number without being > e500-specific? Hm. Isn't the hash always the way you put it above? Maybe I misread the spec :). The only obvious thing it states there are the corner cases O_o. > >>> diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h >>> index d2ca5ed..2419be2 100644 >>> --- a/arch/powerpc/include/asm/kvm.h >>> +++ b/arch/powerpc/include/asm/kvm.h >>> @@ -272,4 +272,39 @@ struct kvm_guest_debug_arch { >>> #define KVM_INTERRUPT_UNSET -2U >>> #define KVM_INTERRUPT_SET_LEVEL -3U >>> >>> +struct kvm_fsl_booke_tlb_entry { >>> + __u32 mas8; >>> + __u32 mas1; >>> + __u64 mas2; >>> + __u64 mas7_3; >> >> Is there a good reason to not include the other MAS registers as defined >> by the architecture? That'd make the interface easily extensible for the >> future and should only eat a few additional bytes, no? > > Those registers aren't part of a TLB entry, they're used for identifying > the slot or searching. Ah, yeah :). Sorry for the fuss. > >>> +struct kvm_fsl_booke_tlb_params { >>> + /* >>> + * book3e defines 4 TLBs, but current FSL Book E chips implement >>> + * only the first two. The second two entries in tlb_sizes[] >>> + * and tlb_ways[] are reserved and must be zero. >> >> I don't think that part is necessary. It's really an implementation detail. > > It is not an implementation detail. KVM on FSL Book E does not support > TLB2 or TLB3. I had more generic wording when it was not FSL specific, > where the supported TLBs were determined by reading the TLB config > registers. Either the struct is 100% e500 specific, then you only need 2 TLBs in there, or it's generic, then the actual implementation of TLB2 and TLB3 are open and as of yet unimplemented, but not reserved for eternity :). > >>> + * >>> + * A tlb_ways value of zero means the array is fully associative. >>> + * Only TLB0 may be configured with a different associativity. The >>> + * number of ways of TLB0 must be a power of two between 2 and 16. >>> + * >>> + * The size of TLB0 must be a multiple of the number of ways, and >>> + * the number of sets must be a power of two. >>> + * >>> + * The size of TLB1 may not exceed 64 entries. >> >> Why not? > > I wanted to set some limit, at least as long as we process TLB1 linearly > rather than use some fancy data structure. 64 is the most we currently > have in hardware, and I don't think we have any plans to increase that > while maintaining full associativity. > > If we need to change it later we can -- either with a capability, or by > having qemu iterate to find the max. Ok :). > >>> + * KVM will adjust TLBnCFG based on the sizes configured here, >>> + * though arrays greater than 2048 entries will have TLBnCFG[NENTRY] >>> + * set to zero. >>> + * >>> + * TLB0 supports 4 KiB pages. >>> + * The page sizes supported by TLB1 are as indicated by >>> + * TLB1CFG (if MMUCFG[MAVN] = 0) or TLB1PS (if MMUCFG[MAVN] = 1). >> >> This is not part of the interface, is it? > > It provides a way for qemu to discover what page sizes KVM will deal with. Hrm. So where does it get MMUCFG from? > >>> @@ -956,51 +994,229 @@ void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid) >>> >>> void kvmppc_e500_tlb_setup(struct kvmppc_vcpu_e500 *vcpu_e500) >>> { >>> - struct tlbe *tlbe; >>> + struct kvm_fsl_booke_tlb_entry *tlbe; >>> >>> /* Insert large initial mapping for guest. */ >>> - tlbe =&vcpu_e500->gtlb_arch[1][0]; >>> + tlbe = get_entry(vcpu_e500, 1, 0); >>> tlbe->mas1 = MAS1_VALID | MAS1_TSIZE(BOOK3E_PAGESZ_256M); >>> tlbe->mas2 = 0; >>> - tlbe->mas3 = E500_TLB_SUPER_PERM_MASK; >>> - tlbe->mas7 = 0; >>> + tlbe->mas7_3 = E500_TLB_SUPER_PERM_MASK; >>> >>> /* 4K map for serial output. Used by kernel wrapper. */ >> >> Which kernel wrapper? At the end of the day, these initial TLB values >> should be pushed down from Qemu btw. They really don't belong here. A >> different guest (u-boot for example) might need completely different >> mappings. > > That was not added by this patchset... and yes, the plan is for qemu to > set the inital TLB using the MMU interface. The entries created here are > just for legacy qemu. It just came to mind when looking through the patch - of course the code needs to stay for old user space, but I wanted to make sure we're on the same page wrt where we're going with TLB inits :). > >>> + >>> +int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu, >>> + struct kvm_config_tlb *cfg) >>> +{ >>> + struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); >>> + struct kvm_fsl_booke_tlb_params params; >>> + char *virt; >>> + struct page **pages; >>> + struct tlbe_priv *privs[2] = {}; >>> + size_t array_len; >>> + u32 sets; >>> + int num_pages, ret, i; >>> + >>> + if (cfg->mmu_type != KVM_MMU_FSL_BOOKE_NOHV) >>> + return -EINVAL; >>> + >>> + if (copy_from_user(¶ms, (void __user *)(uintptr_t)cfg->params, >>> + sizeof(params))) >>> + return -EFAULT; >>> + >>> + if (params.tlb_sizes[1]> 64) >>> + return -EINVAL; >>> + if (params.tlb_sizes[2] != 0 || params.tlb_sizes[3] != 0) >>> + return -EINVAL; >>> + if (params.tlb_ways[1] != 0 || params.tlb_ways[2] != 0 | >> >> Hrm - I always thought that basically the TLB ID would be: >> >> (some bits of the EA) || (ESEL) >> >> So in the TLB1 case, "some bits" would mean "no bits" and ESEL is all of >> the information we get. And ESEL is ways, no? > > TLBnCFG allows ASSOC to be either zero or equal to NENTRY to indicate > full associativity. In this API I chose just one way of representing it > for simplicity. I can pick ASSOC = NENTRY instead if you prefer. Ah, ok. I'd find it easier to understand if we just always stick to (EA) || (ESEL) :). Alex -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html