On Wed, Mar 07, 2012 at 12:01:38AM +0100, Alexander Graf wrote:
>
> On 31.01.2012, at 02:17, Takuya Yoshikawa wrote:
>
> > Added s390 and ppc developers to Cc,
> >
> > (2012/01/30 14:35), Takuya Yoshikawa wrote:
> >> Some members of kvm_memory_slot are not used by every architecture.
> >>
> >> This patch is the first step to make this difference clear by
> >> introducing kvm_memory_slot::arch; lpage_info is moved into it.
> >
> > I am planning to move rmap stuff into arch next if this patch is accepted.
> >
> > Please let me know if you have some opinion about which members should be
> > moved into this.
>
> What is this lpage stuff? When do we need it? Right now the code
> gets executed on ppc, right? And with the patch it doesn't, no?

We do support large pages backing the guest on powerpc, at least for
the Book3S_HV style of KVM, but we don't use the lpage_info array.
The reason is that we only allow the guest to create large-page PTEs
in regions which are backed by large pages on the host side (and which
are therefore large-page aligned on both the host and guest side).

We can enforce that because guests use a hypercall to create PTEs in
the hashed page table, and we have a way (via the device tree) to tell
the guest what page sizes it can use.

In contrast, on x86 we have no control over what PTEs the guest
creates in its page tables, so it can create large-page PTEs inside a
region which is backed by small pages, and which might not be
large-page aligned. This is why we have the separate arrays pointed
to by lpage_info and why there is the logic in kvm_main.c for handling
misalignment at the ends.

So, at the moment on Book3S_HV, I have one entry in the rmap array for
each small page in a memslot. Each entry is an unsigned long and
contains some control bits (dirty and referenced bits, among others)
and the index in the hashed page table (HPT) of one guest PTE that
references that page.
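
As a rough illustration of the rmap entry described above (one word per
small page, holding control bits plus the HPT index of one guest PTE),
here is a minimal sketch in C. The bit positions, the 32-bit index
width, the "present" flag and every sketch_* name are assumptions made
up for this example; they are not taken from the kernel source. The
sketch uses uint64_t rather than unsigned long so that it behaves the
same on 32-bit hosts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define SKETCH_RMAP_DIRTY      (UINT64_C(1) << 63)
#define SKETCH_RMAP_REFERENCED (UINT64_C(1) << 62)
#define SKETCH_RMAP_PRESENT    (UINT64_C(1) << 61) /* index field is valid */
#define SKETCH_RMAP_INDEX_MASK ((UINT64_C(1) << 32) - 1)

typedef uint64_t sketch_rmap_t;

/* Record the HPT index of one guest PTE, keeping the control bits. */
static inline sketch_rmap_t sketch_rmap_set_index(sketch_rmap_t e,
                                                  uint32_t hpt_index)
{
    e &= ~SKETCH_RMAP_INDEX_MASK;
    return e | SKETCH_RMAP_PRESENT | hpt_index;
}

static inline bool sketch_rmap_present(sketch_rmap_t e)
{
    return (e & SKETCH_RMAP_PRESENT) != 0;
}

/* Return the stored HPT index; only meaningful if the entry is present. */
static inline uint32_t sketch_rmap_index(sketch_rmap_t e)
{
    assert(sketch_rmap_present(e));
    return (uint32_t)(e & SKETCH_RMAP_INDEX_MASK);
}

/* One entry per small page: the slot is the page's offset in the memslot. */
static inline size_t sketch_rmap_slot(uint64_t gfn, uint64_t base_gfn)
{
    assert(gfn >= base_gfn);
    return (size_t)(gfn - base_gfn);
}
```

Packing the flags and the index into a single word means an entry can
be read or updated in one access, which is presumably why a plain
unsigned long is enough here.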
There is another array that then forms a doubly-linked circular list
of all the guest PTEs that reference the page.

At present, guest PTEs are linked into the rmap lists based on the
starting address of the page irrespective of the page size, so a
large-page guest PTE gets linked into the same list as a small-page
guest PTE mapping the first small page of the large page. That isn't
ideal from the point of view of dirty and reference tracking, so I
will probably move to having separate lists for the different page
sizes, meaning I will need something like the lpage_info array, but I
won't need the logic that is currently in kvm_main.c for handling it.

Paul.
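
The doubly-linked circular list of guest PTEs mentioned above can be
sketched with a side array indexed by HPT slot, where each element
holds the forward and backward links as indices rather than pointers.
This is only a sketch under assumptions: the struct, the sentinel and
all sketch_* names are hypothetical, and the head index stands in for
the index field kept in the rmap entry.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NO_HEAD UINT32_MAX /* hypothetical "empty list" marker */

/* One element per HPT slot; links are HPT indices, not pointers. */
struct sketch_rev {
    uint32_t forw;
    uint32_t back;
};

/* Link HPT slot i at the tail of the list whose head index is *head. */
static void sketch_rev_add(struct sketch_rev *rev, uint32_t *head, uint32_t i)
{
    if (*head == SKETCH_NO_HEAD) {
        rev[i].forw = rev[i].back = i; /* single element points at itself */
        *head = i;
        return;
    }
    uint32_t h = *head;
    uint32_t tail = rev[h].back;
    rev[i].forw = h;
    rev[i].back = tail;
    rev[tail].forw = i;
    rev[h].back = i;
}

/* Unlink HPT slot i; move the head along if i was the head. */
static void sketch_rev_del(struct sketch_rev *rev, uint32_t *head, uint32_t i)
{
    uint32_t next = rev[i].forw;
    uint32_t prev = rev[i].back;
    if (next == i) { /* i was the only element */
        *head = SKETCH_NO_HEAD;
        return;
    }
    rev[prev].forw = next;
    rev[next].back = prev;
    if (*head == i)
        *head = next;
}

/* Walk the ring once and count the guest PTEs on it. */
static unsigned sketch_rev_count(const struct sketch_rev *rev, uint32_t head)
{
    if (head == SKETCH_NO_HEAD)
        return 0;
    unsigned n = 0;
    uint32_t i = head;
    do {
        n++;
        i = rev[i].forw;
    } while (i != head);
    return n;
}
```

Because the ring is circular and doubly linked, removing a guest PTE
needs only its own HPT index; no walk from the head is required.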