Hi David,

On Friday 25 April 2014 10:27:17 David Daney wrote:
> On 04/25/2014 08:19 AM, James Hogan wrote:
> > Expose the KVM guest CP0_Count bias (from the monotonic kernel time) to
> > userland in nanosecond units via a new KVM_REG_MIPS_COUNT_BIAS register
> > accessible with the KVM_{GET,SET}_ONE_REG ioctls. This gives userland
> > control of the bias so that it can exactly match its own monotonic time.
> >
> > The nanosecond bias is stored separately from the raw bias used
> > internally (since nanoseconds isn't a convenient or efficient unit for
> > various timer calculations), and is recalculated each time the raw count
> > bias is altered. The raw count bias used in CP0_Count determination is
> > recalculated when the nanosecond bias is altered via the KVM_SET_ONE_REG
> > ioctl.
>
> Is this really necessary?

That's a good question...

> The architecture has CP0_COUNT. How does the concept of this new
> synthetic bias value interact with the architecture's CP0_COUNT?

It's a single bias state under the hood for a running timer. Setting the
user_bias effectively results in a guest count of:

CP0_Count = (ktime_to_ns(ktime_get()) + user_bias) * count_hz / 1e9

Under the hood, user_bias is actually converted to a 32-bit bias that
simplifies the calculations, since it wraps more conveniently in places:

CP0_Count = bias + ktime_to_ns(ktime_get()) * count_hz / 1e9

Similarly, setting the guest CP0_Count to count results in the bias (and
user_bias) being recalculated such that CP0_Count at that moment = count:

CP0_Count = count
(substitute CP0_Count)
bias + ktime_to_ns(ktime_get()) * count_hz / 1e9 = count
(rearrange)
bias = count - ktime_to_ns(ktime_get()) * count_hz / 1e9

> It seems like by adding this you now have two ways to access and
> manipulate the same thing.
>
> 1) The architecturally specified CP0_COUNT.
> 2) This new bias thing.

Almost, but not quite.
Full control of the timer value without the new bias is a bit more
complicated than just writing CP0_COUNT...

> What if we just let userspace directly manipulate the CP0_COUNT, and if
> necessary only maintain a bias as an internal implementation detail?

The difference is in the ability for userland to recalculate the CP0_Count
at any moment for a running timer (e.g. taking advantage of the fact that I
believe qemu's qemu_get_clock_ns(rt_clock) = ktime_to_ns(ktime_get())).
When setting the Count directly while the timer is running, the ktime_get()
part cannot be precisely known to userland.

Since I added the COUNT_CTL.DC & COUNT_RESUME registers it can be partly
controlled, but only because the timer can be frozen/snapshotted. Userland
could set COUNT_CTL.DC=1, read COUNT_RESUME to get the time when the timer
was frozen, then set the Count to the desired value at COUNT_RESUME (which
would take effect as if the write happened at COUNT_RESUME nanoseconds) and
set COUNT_CTL.DC=0 to unfreeze the timer. It could then calculate/know the
bias itself, knowing the CP0_Count at a particular time (COUNT_RESUME) with
a particular frequency (COUNT_HZ).

Arguably it's simpler (and probably faster) to just write the COUNT_BIAS
register with a newly calculated or altered bias.

These are, I believe, the strengths of each related interface:

(1) The master DC provides atomic access to both CP0_Count and CP0_Cause
    (both CP0_Cause.DC and CP0_Cause.TI), which isn't provided by other
    interfaces.
(2) The COUNT_RESUME in combination with the master DC provides atomic
    access to the current monotonic time and CP0_Count (provided implicitly
    by (4)), but also atomic control of the CP0_Cause at the same time (not
    provided by other interfaces).
(3) Access to the plain CP0_Count provides straightforward access to the
    CP0_Count of a running or, more importantly, stopped timer.
(4) Access to the bias provides exact control of the value of the timer
    relative to monotonic time while the timer is running (provided
    implicitly by (2)), without having to freeze it (not provided
    elsewhere).

So yes, you could technically manage without (4) by using (2) ((4) was
implemented first), but I think it probably still has some value since you
can do it with a single ioctl rather than 4 ioctls (freeze timer, read
resume_time, read or write count, unfreeze timer).

Enough value to be worthwhile? I haven't really made up my mind yet but I'm
leaning towards yes.

Do you have any further thoughts on that?

Thanks
James
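
For illustration only, the bias arithmetic discussed in this message can be
sketched in a few lines of Python. This is not the kernel code: all function
names are hypothetical, plain integers stand in for ktime values, and the use
of floor division and 32-bit masking is an assumption about rounding and
wrapping. It shows how a nanosecond bias, a raw 32-bit bias, and a directly
written count relate to one another.

```python
NSEC_PER_SEC = 1_000_000_000
MASK32 = 0xFFFFFFFF  # CP0_Count is a 32-bit register, so counts wrap mod 2**32


def ns_to_ticks(ns, count_hz):
    """Convert nanoseconds to timer ticks (floor division is an assumption)."""
    return ns * count_hz // NSEC_PER_SEC


def count_from_user_bias(now_ns, user_bias_ns, count_hz):
    """CP0_Count = (now + user_bias) * count_hz / 1e9, truncated to 32 bits."""
    return ns_to_ticks(now_ns + user_bias_ns, count_hz) & MASK32


def raw_bias_from_user_bias(user_bias_ns, count_hz):
    """Convert the nanosecond bias into a raw 32-bit tick bias."""
    return ns_to_ticks(user_bias_ns, count_hz) & MASK32


def count_from_raw_bias(now_ns, raw_bias, count_hz):
    """CP0_Count = bias + now * count_hz / 1e9, truncated to 32 bits."""
    return (raw_bias + ns_to_ticks(now_ns, count_hz)) & MASK32


def raw_bias_for_count(now_ns, count, count_hz):
    """bias = count - now * count_hz / 1e9, so the timer reads `count` now."""
    return (count - ns_to_ticks(now_ns, count_hz)) & MASK32


def user_bias_for_count(at_ns, count, count_hz):
    """Nanosecond bias that makes the timer read `count` at time `at_ns`.

    This is what userland could derive after the freeze/read/write/unfreeze
    sequence, with `at_ns` playing the role of the COUNT_RESUME value.
    """
    return count * NSEC_PER_SEC // count_hz - at_ns


if __name__ == "__main__":
    hz = 100_000_000          # hypothetical 100 MHz count frequency
    now = 123_456_789_000     # hypothetical monotonic time in ns
    bias = raw_bias_for_count(now, 1000, hz)
    print(hex(bias), count_from_raw_bias(now, bias, hz))
```

With a frequency that does not divide 1e9 evenly, the nanosecond and raw
forms can differ by a tick because each conversion rounds separately; the
sketch makes no attempt to model how the kernel handles that.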