> -----Original Message-----
> From: Hollis Blanchard [mailto:hollisb@xxxxxxxxxx]
> Sent: Friday, April 10, 2009 11:06 AM
> To: Liu Yu-B13201
> Cc: kvm-ppc@xxxxxxxxxxxxxxx; Rahul Kulkarni
> Subject: Re: [PATCH] Map guest (ts,tid) to individual shadow tid
>
> On Friday 10 April 2009 04:13:26 Liu Yu-B13201 wrote:
> > >
> > > > But there is one thing I haven't thought through.
> > > > That is how should we handle guest kernel mapping (tid=0)
> > > > after apply this trick?
> > > > As it's supposed to share by all PIDs.
> > > > Any suggestion?
> > >
> > > Hmm, that's a good question.
> > >
> > > On e500, you could use the PID1 register, but that doesn't appear in
> > > 440, and won't be available in future implementations anyways.
> > >
> > > In general, I don't know... I'm reluctant to sacrifice the other PID
> > > hack we use, which allows us to flip between guest user and guest
> > > kernel without flushing the TLB. Since that's probably way more
> > > frequent than context switches, maybe it's worth suffering the TLB
> > > flush on context switch.
> > >
> > > It would be OK to map gPID 0 to another arbitrary host PID, except we
> > > need copy_to_user() to work in the guest. To elaborate, if we had this
> > > address space:
> > >
> > >   0          0xc      0xf
> > >   | gTID 3   | gTID 0 |        gPID=3
> > >
> > > we could implement it like this:
> > >
> > >   0          0xc      0xf
> > >   | sTID 27  | sTID 9 |
> > >
> > > and switch sPID between 9 and 27 when jumping between guest user and
> > > guest kernel. However, if we're in the guest kernel (sTID=9 mappings)
> > > with sPID=9, copy_to_user() will fault because sTID=27 mappings aren't
> > > visible.
> > >
> > > When those faults occur, we could create a new userspace mapping with
> > > sTID 9, aliasing the sTID 27 mappings. Would that work?
> > > Odds are we won't get more than 1 or 2 of these at a time, but we'd
> > > need to track these mappings so that guest updates to the sTID=27
> > > mappings also modify or flush our extra sTID=9 mappings.
> > >
> > > That might be the only complexity, and might not be that bad. I don't
> > > know, what do you think? If e500 can selectively flush by PID, maybe
> > > that's better, but 440 can't so we'd want some scheme like I
> > > described.
> > >
> >
> > E500 cannot selectively flush by PID either... Seems e500mc could.
> > It's a good idea that use PID1 to map guest kernel's stid for e500,
> > but there is another problem that how to handle guest mapping access
> > privilege.
> >
> > I think we should apart this problem into two to make it clear.
> > 1). how to handle guest share mapping(tid=0).
> > 2). how to handle guest mapping without user access privilege.
> > What I mean is there may be non-zero tid mapping without user access
> > privilege,
> > Although this seems don't exist in Linux.
>
> Yeah, I can't think of how this would be used. Maybe some microkernel? I
> don't think I want to worry about this case right now...
>
> > I have two different ways in mind to solve it.
> >
> > 1. utilize e500 PID1 as you mentioned
> > We can set PID1 when enter guest kernel and clear PID1 when enter guest
> > userspace.
> >
> > Pro.
> > this method can minimize TLB flush.
> > Con.
> > it's based on the assumption that all guest access limited mappings
> > always have tid=0,
> > or non-zero mappings always have user access bits.
> > It's fine for Linux but not sure for other OSes.
>
> Con:
> - Only works on e500v1/v2. :) But that's where the immediate need is, and I
>   don't see a better general-purpose solution, so I think it's OK.
>
> That reminds me...
>
> Rahul, how does NetBSD handle copy_to_user() from AS=0, since userspace TLB
> mappings have TS=1?
> (I tried to find source myself, but it looks like the only
> port was done by Wasabi Systems, which apparently abruptly closed down
> recently.) Does it just fault in a new TLB entry as needed?
>

Yes. For smaller copies (< 256 bytes), it simply switches between AS 0 and
AS 1, and would fault in a new TLB entry for TS=1 through the regular fault
path if needed. (I would think the odds of this happening are low, since the
TS=1 entry should be present?)

For bigger copies, the user PA range (extracted from the user VA range) is
mapped into the kernel address space (kernel pmap) and then simply memcpy'd.

> --
> Hollis Blanchard
> IBM Linux Technology Center
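
For reference, the alias-tracking idea quoted earlier in this thread (create
an sTID=9 alias when copy_to_user() faults, and drop it whenever the guest
updates the matching sTID=27 mapping) could be sketched roughly as below.
This is only an illustration and not code from the patch: the names
stid_alias, alias_table, alias_add, alias_invalidate and MAX_ALIASES are all
hypothetical, the table size is an assumption based on the "1 or 2 at a time"
estimate, and the actual TLB invalidation is left as a comment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: only 1 or 2 aliases are expected at a time,
 * so a tiny fixed-size table is enough for a sketch. */
#define MAX_ALIASES 4

/* One extra shadow mapping created so guest-kernel code can reach user pages. */
struct stid_alias {
    bool     valid;
    uint32_t gvpage;     /* guest virtual page of the user mapping */
    uint8_t  user_stid;  /* shadow TID of the original mapping, e.g. 27 */
    uint8_t  kern_stid;  /* shadow TID the alias lives under, e.g. 9 */
};

struct alias_table {
    struct stid_alias slot[MAX_ALIASES];
};

/* Record a new alias after a copy_to_user() fault.
 * Returns false when the table is full; a caller could then flush instead. */
static bool alias_add(struct alias_table *t, uint32_t gvpage,
                      uint8_t user_stid, uint8_t kern_stid)
{
    for (size_t i = 0; i < MAX_ALIASES; i++) {
        if (!t->slot[i].valid) {
            t->slot[i] = (struct stid_alias){
                .valid = true, .gvpage = gvpage,
                .user_stid = user_stid, .kern_stid = kern_stid,
            };
            return true;
        }
    }
    return false;
}

/* Called when the guest modifies or invalidates a user mapping.
 * Drops every alias of that mapping and returns how many were dropped. */
static size_t alias_invalidate(struct alias_table *t, uint32_t gvpage,
                               uint8_t user_stid)
{
    size_t dropped = 0;

    for (size_t i = 0; i < MAX_ALIASES; i++) {
        struct stid_alias *a = &t->slot[i];

        if (a->valid && a->gvpage == gvpage && a->user_stid == user_stid) {
            /* Real code would also invalidate the shadow TLB entry here. */
            a->valid = false;
            dropped++;
        }
    }
    return dropped;
}
```

A fault handler would call alias_add() after installing the sTID=9 entry, and
the guest TLB write/invalidate emulation paths would call alias_invalidate()
before touching the sTID=27 entry. Whether a small table like this is
sufficient depends on how many aliases are live in practice.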