Re: RFC: New API for PPC for vcpu mmu access

On 02.02.2011, at 21:33, Yoder Stuart-B08248 wrote:

> Below is a proposal for a new API for PPC to allow KVM clients
> to set MMU state in a vcpu.
> 
> BookE processors have one or more software managed TLBs and
> currently there is no mechanism for Qemu to initialize
> or access them.  This is needed for normal initialization
> as well as debug.
> 
> There are 4 APIs:
> 
> -KVM_PPC_SET_MMU_TYPE allows the client to negotiate the type
> of MMU with KVM-- the type determines the size and format
> of the data in the other APIs

This should be done through the PVR hint in sregs, no? Usually a single CPU type only has a single MMU type.

> -KVM_PPC_INVALIDATE_TLB invalidates all TLB entries in all
> TLBs in the vcpu
> 
> -KVM_PPC_SET_TLBE sets a TLB entry-- the Power architecture
> specifies the format of the MMU data passed in

This seems too fine-grained. I'd prefer a list of all TLB entries to be pushed in either direction. What's the foreseeable number of TLB entries within the next 10 years?

Having the whole stack available would make the sync with qemu easier and also allows us to only do a single ioctl for all the TLB management. Thanks to the PVR we know the size of the TLB, so we don't have to shove that around.
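
To make the "whole TLB in one go" idea concrete, here is a rough user-space sketch. Everything in it is an assumption for illustration: the struct names, the per-entry layout and the fixed entry count are made up and are not a proposed ABI. The only architectural fact used is that MAS1[V] is the top bit of the 32-bit MAS1 value. The point is just that qemu keeps one image of the TLB, resets it locally, and could hand the whole thing to the kernel in a single call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TLB_ENTRIES 16          /* hypothetical; would follow from the PVR */
#define SKETCH_MAS1_VALID  0x80000000u /* MAS1[V] */

/* Hypothetical per-entry layout, not the proposed ABI. */
struct tlbe_sketch {
    uint32_t mas1;
    uint64_t mas2;
    uint64_t mas7_3;
};

/* The whole TLB as one object, so a single call can move all of it. */
struct tlb_image_sketch {
    struct tlbe_sketch entry[SKETCH_TLB_ENTRIES];
};

/* Reset: zero every field, which clears MAS1[V] in every entry. */
void tlb_image_reset(struct tlb_image_sketch *img)
{
    assert(img != NULL);
    memset(img, 0, sizeof(*img));
}

/* Fill one slot of the user-space copy; returns -1 if idx is out of range. */
int tlb_image_set(struct tlb_image_sketch *img, size_t idx,
                  uint32_t mas1, uint64_t mas2, uint64_t mas7_3)
{
    if (idx >= SKETCH_TLB_ENTRIES)
        return -1;
    img->entry[idx].mas1 = mas1;
    img->entry[idx].mas2 = mas2;
    img->entry[idx].mas7_3 = mas7_3;
    return 0;
}

/* Count the entries that have MAS1[V] set. */
size_t tlb_image_count_valid(const struct tlb_image_sketch *img)
{
    size_t n = 0;
    for (size_t i = 0; i < SKETCH_TLB_ENTRIES; i++)
        if (img->entry[i].mas1 & SKETCH_MAS1_VALID)
            n++;
    return n;
}
```

With something like this, a separate invalidate call is not needed: resetting the image and pushing it down has the same effect.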


> -KVM_PPC_GET_TLB allows searching, reading a specific TLB entry,
> or iterating over an entire TLB.  Some TLBs may have an unspecified
> geometry and thus the need to be able to iterate in order
> to dump the TLB.  The Power architecture specifies the format
> of the MMU data
> 
> Feedback welcome.
> 
> Thanks,
> Stuart Yoder
> 
> ------------------------------------------------------------------
> 
> KVM PPC MMU API
> ---------------
> 
> User space can query whether the APIs to access the vcpu mmu
> are available with the KVM_CHECK_EXTENSION API using
> the KVM_CAP_PPC_MMU argument.
> 
> If the KVM_CAP_PPC_MMU return value is non-zero it specifies that
> the following APIs are available:
> 
>   KVM_PPC_SET_MMU_TYPE
>   KVM_PPC_INVALIDATE_TLB
>   KVM_PPC_SET_TLBE
>   KVM_PPC_GET_TLB
> 
> 
> KVM_PPC_SET_MMU_TYPE
> --------------------
> 
> Capability: KVM_CAP_PPC_SET_MMU_TYPE
> Architectures: powerpc
> Type: vcpu ioctl
> Parameters: __u32 mmu_type (in)
> Returns: 0 if specified MMU type is supported, else -1
> 
> Sets the MMU type.  Valid input values are:
>   BOOKE_NOHV   0x1
>   BOOKE_HV     0x2
> 
> A return value of 0x0 indicates that KVM supports
> the specified MMU type.

We should probably return some failure code when a PVR gets set that KVM doesn't understand. That would automatically give us that functionality.
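
Roughly what I have in mind, as a sketch only. The table contents are placeholder PVR values invented for the example, not real ones, and the function and type names are hypothetical. The idea is that the PVR lookup either yields an MMU type or fails, so user space learns about an unsupported MMU without a separate ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* MMU type values taken from the proposal above. */
enum mmu_type_sketch {
    MMU_BOOKE_NOHV = 0x1,
    MMU_BOOKE_HV   = 0x2,
};

struct pvr_entry_sketch {
    uint32_t mask;
    uint32_t value;
    enum mmu_type_sketch mmu;
};

/* Placeholder PVR values, invented for illustration only. */
static const struct pvr_entry_sketch pvr_table[] = {
    { 0xffff0000u, 0x12340000u, MMU_BOOKE_NOHV },
    { 0xffff0000u, 0x56780000u, MMU_BOOKE_HV },
};

/* Returns 0 and fills *out for a known PVR, -EINVAL otherwise. */
int mmu_type_from_pvr(uint32_t pvr, enum mmu_type_sketch *out)
{
    assert(out != NULL);
    for (size_t i = 0; i < sizeof(pvr_table) / sizeof(pvr_table[0]); i++) {
        if ((pvr & pvr_table[i].mask) == pvr_table[i].value) {
            *out = pvr_table[i].mmu;
            return 0;
        }
    }
    return -EINVAL;
}
```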

> 
> KVM_PPC_INVALIDATE_TLB
> ----------------------
> 
> Capability: KVM_CAP_PPC_MMU
> Architectures: powerpc
> Type: vcpu ioctl
> Parameters: none
> Returns: 0 on success, -1 on error
> 
> Invalidates all TLB entries in all TLBs of the vcpu.

The only reason we need to do this is because there's no proper reset function in qemu for the e500 tlb. I'd prefer to have that there and push the TLB contents down on reset.

> 
> KVM_PPC_SET_TLBE
> ----------------
> 
> Capability: KVM_CAP_PPC_MMU
> Architectures: powerpc
> Type: vcpu ioctl
> Parameters:
>        For mmu types BOOKE_NOHV and BOOKE_HV : struct kvm_ppc_booke_mmu (in)
> Returns: 0 on success, -1 on error
> 
> Sets an MMU entry in a virtual CPU.
> 
> For mmu types BOOKE_NOHV and BOOKE_HV:
> 
>      To write a TLB entry, set the mas fields of kvm_ppc_booke_mmu 
>      as per the Power architecture.
> 
>      struct kvm_ppc_booke_mmu {
>            union {
>                  __u64 mas0_1;
>                  struct {
>                        __u32 mas0;
>                        __u32 mas1;
>                  };
>            };
>            __u64 mas2;
>            union {
>                  __u64 mas7_3;
>                  struct {
>                        __u32 mas7;
>                        __u32 mas3;
>                  };
>            };
>            union {
>                  __u64 mas5_6;
>                  struct {
>                        __u32 mas5;
>                        __u32 mas6;
>                  };
>            };
>            __u32 mas8;
>      };
> 
>      For a mmu type of BOOKE_NOHV, the mas5 and mas8 fields
>      in kvm_ppc_booke_mmu are present but not supported.

Haven't fully made up my mind on the tlb entry structure yet. Maybe something like

struct kvm_ppc_booke_tlbe {
    __u64 data[8];
};

would be enough? The rest is implementation dependent anyways. Exposing those details to user space doesn't buy us anything. By keeping it generic we can at least still build against older kernel headers :).
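
As an illustration of how user space could treat such an opaque entry, here is a sketch. The slot assignment (which MAS value goes in which data[] element) is purely an assumption made up for the example, as are the helper names. It uses uint64_t from stdint.h in place of __u64 so it builds without kernel headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Opaque entry as suggested above (uint64_t standing in for __u64). */
struct kvm_ppc_booke_tlbe {
    uint64_t data[8];
};

/* Hypothetical user-space view of the MAS registers for one entry. */
struct mas_regs_sketch {
    uint32_t mas0, mas1;
    uint64_t mas2;
    uint32_t mas3, mas7, mas8;
};

/* Assumed layout: data[0] = mas0:mas1, data[1] = mas2,
 * data[2] = mas7:mas3, data[3] = mas8, remaining slots zero. */
void tlbe_pack(struct kvm_ppc_booke_tlbe *e, const struct mas_regs_sketch *m)
{
    assert(e != NULL && m != NULL);
    memset(e, 0, sizeof(*e));
    e->data[0] = ((uint64_t)m->mas0 << 32) | m->mas1;
    e->data[1] = m->mas2;
    e->data[2] = ((uint64_t)m->mas7 << 32) | m->mas3;
    e->data[3] = m->mas8;
}

/* Inverse of tlbe_pack under the same assumed layout. */
void tlbe_unpack(const struct kvm_ppc_booke_tlbe *e, struct mas_regs_sketch *m)
{
    assert(e != NULL && m != NULL);
    m->mas0 = (uint32_t)(e->data[0] >> 32);
    m->mas1 = (uint32_t)e->data[0];
    m->mas2 = e->data[1];
    m->mas7 = (uint32_t)(e->data[2] >> 32);
    m->mas3 = (uint32_t)e->data[2];
    m->mas8 = (uint32_t)e->data[3];
}
```

The unused slots leave room for an implementation to add fields later without changing the size of the struct.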

> 
> 
> KVM_PPC_GET_TLB
> ---------------
> 
> Capability: KVM_CAP_PPC_MMU
> Architectures: powerpc
> Type: vcpu ioctl
> Parameters: struct kvm_ppc_get_mmu (in/out)
> Returns: 0 on success
>         -1 on error
>         errno = ENOENT when iterating and there are no more entries to read
> 
> Reads an MMU entry from a virtual CPU.
> 
>      struct kvm_ppc_get_mmu {
>            /* in */
>            void *mmu;
>            __u32 flags;
>                  /* a bitmask of flags to the API */
>                  /*     TLB_READ_FIRST   0x1      */
>                  /*     TLB_SEARCH       0x2      */
>            /* out */
>            __u32 max_entries;
>      };
> 
> For mmu types BOOKE_NOHV and BOOKE_HV :
> 
>      The "void *mmu" field of kvm_ppc_get_mmu points to 
>        a struct of type "struct kvm_ppc_booke_mmu".
> 
>      If TLBnCFG[NENTRY] > 0 and TLBnCFG[ASSOC] > 0, the TLB has
>      a known number of entries and associativity.  The mas0[ESEL]
>      and mas2[EPN] fields specify which entry to read.
> 
>      If TLBnCFG[NENTRY] == 0 the number of TLB entries is 
>      undefined and this API can be used to iterate over
>      the entire TLB selected with TLBSEL in mas0.
> 
>      -To read a TLB entry:
> 
>         set the following fields in the mmu struct (struct kvm_ppc_booke_mmu):
>            flags=0
>            mas0[TLBSEL] // select which TLB is being read
>            mas0[ESEL]   // select which entry is being read
>            mas2[EPN]    // effective address 
> 
>         On return the following fields are updated as per the Power architecture:
>            mas0
>            mas1 
>            mas2 
>            mas3 
>            mas7 
> 
>      -To iterate over a TLB (read all entries):
> 
>        To start an iteration sequence, set the following fields in
>        the mmu struct (struct kvm_ppc_booke_mmu)
>            flags=TLB_READ_FIRST
>            mas0[TLBSEL]  // select which TLB is being read
> 
>        On return the following fields are updated:
>            mas0           // set as per Power arch
>            mas1           // set as per Power arch
>            mas2           // set as per Power arch
>            mas3           // set as per Power arch
>            mas7           // set as per Power arch
>            max_entries    // Contains upper limit on number of entries that may
>                           // be returned. A value of 0xffffffff means there is
>                           // no meaningful upper bound.
> 
>        For subsequent calls to the API the following output fields must
>        be passed back into the API unmodified:
>            flags
>            mas0
>            mas2
> 
>        A return value of -1 with errno set to ENOENT indicates that
>        there are no more entries to be read.
> 
>      -To search for a TLB entry:
> 
>         To search for a TLB entry, set the following fields in
>         the mmu struct (struct kvm_ppc_booke_mmu):
>            flags=TLB_SEARCH
>            mas2[EPN]    // effective address to search for
>            mas6         // set as per the Power arch
>            mas5         // set as per the Power arch
> 
>         On return, the following fields are updated as per the Power architecture:
>            mas0
>            mas1 
>            mas2 
>            mas3 
>            mas7 

Userspace should only really need the TLB entries for

  1) Debugging
  2) Migration

So I don't see the point in making the interface optimized for single TLB entries. Do you have other use cases in mind?


Alex
