Re: [PATCH 1/9] KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls

Alexander Graf <agraf@xxxxxxx> · Fri, 19 Apr 2013 17:16:46 +0200

On 05.04.2013, at 00:38, Paul Mackerras wrote:

> On Thu, Apr 04, 2013 at 11:49:55AM +0200, Alexander Graf wrote:
>> 
>> On 04.04.2013, at 07:37, Paul Mackerras wrote:
>> 
>>> On Thu, Mar 21, 2013 at 09:52:16AM +0100, Alexander Graf wrote:
>>>>> +/* Platform specific hcalls, used by KVM */
>>>>> +#define H_RTAS			0xf000
>>>> 
>>>> How about you define a different hcall ID for this? Then QEMU would
>>>> create its "rtas entry blob" such that KVM-routed RTAS handling goes
>>>> to KVM directly.
>>> 
>>> QEMU can still do that, and I don't see that it would change the
>>> kernel side if it did.  We would still have to have agreement between
>>> the kernel and userspace as to what the hcall number for invoking the
>>> in-kernel RTAS calls was, and the kernel would still have to keep a
>>> list of token numbers and how they correspond to the functions it
>>> provides.  The only thing different would be that the in-kernel RTAS
>>> hcall could return to the guest if it didn't recognize the token
>>> number, rather than pushing the problem up to userspace.  However,
>>> that wouldn't make the code any simpler, and it isn't a situation
>>> where performance is an issue.
>>> 
>>> Do you see some kernel-side improvements or simplifications from your
>>> suggestion that I'm missing?  Remember, the guest gets the token
>>> numbers from the device tree (properties under the /rtas node), so
>>> they are under the control of userspace/QEMU.
>> 
>> The code flow with this patch:
>> 
>>  <setup time>
>> 
>>  foreach (override in overrides)
>>    ioctl(OVERRIDE_RTAS, ...);
>> 
>>  <runtime>
>> 
>>  switch (hcall_id) {
>>  case QEMU_RTAS_ID:
>>    foreach (override in kvm_overrides) {
>>      int rtas_id = ...;
>>      if (override.rtas_id == rtas_id) {
>>        handle_rtas();
> 
> Actually this is more like: override.handler();
> 
>>        handled = true;
>>      }
>>    }
>>    if (!handled)
>>      pass_to_qemu();
>>    break;
>>  default:
>>    pass_to_qemu();
>>    break
>>  }
>> 
>> What I'm suggesting:
>> 
>>  <setup time>
>> 
>>  nothing from KVM's point of view
> 
> Actually, this can't be "nothing".
> 
> The way the RTAS calls work is that there is a name and a "token"
> (32-bit integer value) for each RTAS call.  The tokens have to be
> unique for each different name.  Userspace puts the names and tokens
> in the device tree under the /rtas node (a set of properties where the
> property name is the RTAS function name and the property value is the
> token).  The guest looks up the token for each RTAS function it wants
> to use, and passes the token in the argument buffer for the RTAS call.
> 
> This means that userspace has to know the names and tokens for all
> supported RTAS functions, both the ones implemented in the kernel and
> the ones implemented in userspace.
> 
> Also, the token numbers are pretty arbitrary, and the token numbers
> for the kernel-implemented RTAS functions could be chosen by userspace
> or by the kernel.  If they're chosen by the kernel, then userspace
> needs a way to discover them (so it can put them in the device tree),
> and also has to avoid choosing any token numbers for its functions
> that collide with a kernel-chosen token.  If userspace chooses the
> token numbers, it has to tell the kernel what token numbers it has
> chosen for the kernel-implemented RTAS functions.  We chose the latter
> since it gives userspace more control.
> 
> So this <setup time> code has to be either (your suggestion):
> 
>    foreach RTAS function possibly implemented in kernel {
>        query kernel token for function, by name
> 	if that gives an error, mark function as needing to be
> 	        implemented in userspace
>    }
>    (userspace) allocate tokens for remaining functions,
>                avoiding collisions with kernel-chosen tokens
> 
> or else it is (my suggestion):
> 
>    (userspace) allocate tokens for all RTAS functions
>    foreach RTAS function possibly implemented in kernel {
>        tell kernel the (name, token) correspondence
>    }
> 
>>  <runtime>
>> 
>>  switch (hcall_id) {
>>  case KVM_RTAS_ID:
>>    handle_rtas();
> 
> Here, you've compressed details that you expanded in your pseudo-code
> above, making this a less than fair comparison.  This handle_rtas()
> function has to fetch the token and branch out to the appropriate
> handler routine.  Whether that's a switch statement or a loop over
> registered handlers doesn't make all that much difference.
> 
>>    break;
>>  default:
>>    pass_to_qemu();
>>    break;
>>  }
>> 
>> 
>> Which one looks easier and less error prone to you? :)
>> 
>> Speaking of which, how does user space know that the kernel actually
>> supports a specific RTAS token? 
> 
> It's really the names that are more important, the tokens are pretty
> arbitrary.  In my scheme, userspace does a KVM_PPC_RTAS_DEFINE_TOKEN
> ioctl giving the name and the (userspace-chosen) token, which gets an
> error if the kernel doesn't recognize the name.  In your scheme, there
> would have to be an equivalent ioctl to query the (kernel-chosen)
> token for a given name, which once again would return an error if the
> kernel doesn't recognize the name.  Either way the kernel has to have
> a list of names that it knows about.

Hrm. I think I'm slowly grasping what the real issue is.

Fair enough, your approach works for me then.

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html