Avi Kivity wrote: > On 06/17/2010 10:25 AM, Xiao Guangrong wrote: >> >> >>> Can this in fact work for level != PT_PAGE_TABLE_LEVEL? We might start >>> at PT_PAGE_DIRECTORY_LEVEL but get 4k pages while iterating. >>> >> Ah, i forgot it. We can't assume that the host also support huge page for >> next gfn, as Marcelo's suggestion, we should "only map with level> 1 if >> the host page matches the size". >> >> Um, the problem is, when we get host page size, we should hold >> 'mm->mmap_sem', >> it can't used in atomic context and it's also a slow path, we hope pte >> prefetch >> path is fast. >> >> How about only allow prefetch for sp.leve = 1 now? i'll improve it in >> the future, >> i think it need more time :-) >> > > I don't think prefetch for level > 1 is worthwhile. One fault per 2MB > is already very good, no need to optimize it further. > OK >>>> + >>>> + pfn = gfn_to_pfn_atomic(vcpu->kvm, gfn); >>>> + if (is_error_pfn(pfn)) { >>>> + kvm_release_pfn_clean(pfn); >>>> + break; >>>> + } >>>> + if (pte_prefetch_topup_memory_cache(vcpu)) >>>> + break; >>>> + >>>> + mmu_set_spte(vcpu, spte, ACC_ALL, ACC_ALL, 0, 0, 1, NULL, >>>> + sp->role.level, gfn, pfn, true, false); >>>> + } >>>> +} >>>> >>>> >>> Nice. Direct prefetch should usually succeed. >>> >>> Can later augment to call get_users_pages_fast(..., PTE_PREFETCH_NUM, >>> ...) to reduce gup overhead. >>> >> But we can't assume the gfn's hva is consecutive, for example, gfn and >> gfn+1 >> maybe in the different slots. >> > > Right. We could limit it to one slot then for simplicity. OK, i'll do it. > >> >>>> + >>>> + if (!table) { >>>> + page = gfn_to_page_atomic(vcpu->kvm, sp->gfn); >>>> + if (is_error_page(page)) { >>>> + kvm_release_page_clean(page); >>>> + break; >>>> + } >>>> + table = kmap_atomic(page, KM_USER0); >>>> + table = (pt_element_t *)((char *)table + offset); >>>> + } >>>> >>>> >>> Why not kvm_read_guest_atomic()? Can do it outside the loop. >>> >> Do you mean that read all prefetched sptes at one time? >> > > Yes. > >> If prefetch one spte fail, the later sptes that we read is waste, so i >> choose read next spte only when current spte is prefetched successful. >> >> But i not have strong opinion on it since it's fast to read all sptes at >> one time, at the worst case, only 16 * 8 = 128 bytes we need to read. >> > > In general batching is worthwhile, the cost of the extra bytes is low > compared to the cost of bringing in the cacheline and error checking. > Agreed. > btw, you could align the prefetch to 16 pte boundary. That would > improve performance for memory that is scanned backwards. > Yeah, good idea. > So we can change the fault path to always fault 16 ptes, aligned on 16 > pte boundary, with the needed pte called with specualtive=false. Avi, i not understand it clearly, Could you please explain it? :-( -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html