Re: [PATCH 14/21] MIPS: KVM: Add nanosecond count bias KVM register

Hi David,

On Friday 25 April 2014 10:27:17 David Daney wrote:
> On 04/25/2014 08:19 AM, James Hogan wrote:
> > Expose the KVM guest CP0_Count bias (from the monotonic kernel time) to
> > userland in nanosecond units via a new KVM_REG_MIPS_COUNT_BIAS register
> > accessible with the KVM_{GET,SET}_ONE_REG ioctls. This gives userland
> > control of the bias so that it can exactly match its own monotonic time.
> > 
> > The nanosecond bias is stored separately from the raw bias used
> > internally (since nanoseconds isn't a convenient or efficient unit for
> > various timer calculations), and is recalculated each time the raw count
> > bias is altered. The raw count bias used in CP0_Count determination is
> > recalculated when the nanosecond bias is altered via the KVM_SET_ONE_REG
> > ioctl.
> 
> Is this really necessary?

That's a good question...

> 
> The architecture has CP0_COUNT.  How does the concept of this new
> synthetic bias value interact with the architecture's CP0_COUNT?

It's a single bias state under the hood for a running timer.

Setting the user_bias effectively results in a guest count of:
CP0_Count = (ktime_to_ns(ktime_get()) + user_bias) * count_hz / 1e9

Under the hood, user_bias is actually converted to a 32-bit bias, which 
simplifies the calculations since it wraps more conveniently in places:
CP0_Count = bias + ktime_to_ns(ktime_get()) * count_hz / 1e9

Similarly setting the guest CP0_Count to count results in the bias (and 
user_bias) being recalculated such that CP0_Count at that moment = count:
CP0_Count = count
(substitute CP0_Count)
bias + ktime_to_ns(ktime_get()) * count_hz / 1e9 = count
(rearrange)
bias = count - ktime_to_ns(ktime_get()) * count_hz / 1e9
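As a rough illustration of the arithmetic above (a userland sketch only, not 
the code from the patch): the helper names below are hypothetical, the 
monotonic time is passed in as now_ns rather than read with ktime_get(), and 
rounding is simply truncation, which may differ from what the kernel does.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: scale nanoseconds to timer ticks at count_hz. */
static uint64_t ns_to_ticks(uint64_t ns, uint32_t count_hz)
{
    /* Split into whole seconds and remainder to avoid overflowing
     * the intermediate product; the result is still floor(ns * hz / 1e9)
     * (modulo 2^64, which is harmless once truncated to 32 bits). */
    uint64_t s = ns / NSEC_PER_SEC;
    uint64_t r = ns % NSEC_PER_SEC;
    return s * count_hz + (r * count_hz) / NSEC_PER_SEC;
}

/* CP0_Count = bias + now_ns * count_hz / 1e9, wrapping at 32 bits. */
static uint32_t guest_count(uint32_t bias, uint64_t now_ns, uint32_t count_hz)
{
    assert(count_hz > 0);
    return bias + (uint32_t)ns_to_ticks(now_ns, count_hz);
}

/* bias = count - now_ns * count_hz / 1e9, so that CP0_Count == count now. */
static uint32_t bias_for_count(uint32_t count, uint64_t now_ns,
                               uint32_t count_hz)
{
    assert(count_hz > 0);
    return count - (uint32_t)ns_to_ticks(now_ns, count_hz);
}

/* Convert a signed nanosecond bias (user_bias) to the 32-bit raw bias. */
static uint32_t raw_bias_from_ns(int64_t user_bias_ns, uint32_t count_hz)
{
    assert(count_hz > 0);
    if (user_bias_ns < 0) {
        /* Take the magnitude in unsigned arithmetic, then negate the ticks. */
        uint64_t mag = (uint64_t)0 - (uint64_t)user_bias_ns;
        return (uint32_t)((uint64_t)0 - ns_to_ticks(mag, count_hz));
    }
    return (uint32_t)ns_to_ticks((uint64_t)user_bias_ns, count_hz);
}
```

The unsigned 32-bit arithmetic is what makes the raw bias "wrap 
conveniently": a bias computed by bias_for_count() can be numerically huge, 
yet guest_count() still returns the requested count at that same instant.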

> 
> It seems like by adding this you now have two ways to access and
> manipulate the same thing.
> 
> 1) The architecturally specified CP0_COUNT.
> 2) This new bias thing.

Almost, but not quite. Full control of the timer value without the new bias is 
a bit more complicated than just writing CP0_COUNT...

> 
> What if we just let userspace directly manipulate the CP0_COUNT, and if
> necessary only maintain a bias as an internal implementation detail?

The difference is in the ability for userland to recalculate the CP0_Count at 
any moment for a running timer (e.g. taking advantage of the fact that, I 
believe, qemu's qemu_get_clock_ns(rt_clock) = ktime_to_ns(ktime_get())).

When setting the Count directly while the timer is running, userland cannot 
precisely know the ktime_get() part.

Since I added the COUNT_CTL.DC & COUNT_RESUME registers it can be partly 
controlled, but only because the timer can be frozen/snapshotted. Userland 
could set COUNT_CTL.DC=1, read COUNT_RESUME to get the time when the timer was 
frozen, then set the Count to the desired value at COUNT_RESUME (which would 
take effect as if the write happened at COUNT_RESUME nanoseconds) and set 
COUNT_CTL.DC=0 to unfreeze the timer. It could then calculate/know the bias 
itself, knowing the CP0_Count at a particular time (COUNT_RESUME) with a 
particular frequency (COUNT_HZ).
Arguably it's simpler (and probably faster) to just write the COUNT_BIAS 
register with a newly calculated or altered bias.
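To make the "calculate/know the bias itself" step concrete, here is a minimal 
sketch of the userland arithmetic, assuming the Count, COUNT_RESUME and 
COUNT_HZ values have already been read back. The function names are 
hypothetical, no KVM ioctls are issued, and truncating division is an 
assumption rather than the kernel's exact rounding.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Nanosecond bias such that
 *   (resume_ns + bias) * count_hz / 1e9 == count
 * i.e. the timer read `count` at monotonic time `resume_ns`.
 */
static int64_t user_bias_from_snapshot(uint32_t count, int64_t resume_ns,
                                       uint32_t count_hz)
{
    assert(count_hz > 0);
    /* count * 1e9 fits in 64 bits because count is at most 2^32 - 1. */
    int64_t count_ns = (int64_t)(((uint64_t)count * NSEC_PER_SEC) / count_hz);
    return count_ns - resume_ns;
}

/* CP0_Count = (now_ns + user_bias) * count_hz / 1e9, truncated to 32 bits. */
static uint32_t count_at(int64_t user_bias_ns, int64_t now_ns,
                         uint32_t count_hz)
{
    int64_t biased = now_ns + user_bias_ns;
    /* Sketch-only assumption: the biased time is not negative. */
    assert(count_hz > 0 && biased >= 0);
    uint64_t t = (uint64_t)biased;
    return (uint32_t)((t / NSEC_PER_SEC) * count_hz +
                      ((t % NSEC_PER_SEC) * count_hz) / NSEC_PER_SEC);
}
```

Since CP0_Count is only 32 bits wide, the snapshot pins the bias down only 
modulo one wrap period of the counter; the sketch returns the representative 
in which the counter has not yet wrapped.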

These are, I believe, the strengths of each related interface:
(1) The master DC provides atomic access to both CP0_Count and CP0_Cause (both 
CP0_Cause.DC and CP0_Cause.TI) which isn't provided by other interfaces.
(2) The COUNT_RESUME in combination with the master DC provides atomic access 
to the current monotonic time and CP0_Count (provided implicitly by (4)), but 
also atomic control of the CP0_Cause at the same time (not provided by other 
interfaces).
(3) Access to the plain CP0_Count provides straightforward access to the 
CP0_Count of a running or more importantly stopped timer.
(4) Access to the bias provides exact control of the value of the timer 
relative to monotonic time while the timer is running (provided implicitly by 
(2)), without having to freeze it (not provided elsewhere).

So yes, you could technically manage without (4) by using (2) ((4) was 
implemented first), but I think it probably still has some value since you can 
do it with a single ioctl rather than 4 ioctls (freeze timer, read 
resume_time, read or write count, unfreeze timer).

Enough value to be worthwhile? I haven't really made up my mind yet but I'm 
leaning towards yes.

Do you have any further thoughts on that?

Thanks
James
