Re: About the changes in co_timer_ack() function of time.c.

Ralf Baechle <ralf@xxxxxxxxxxxxxx> · Tue, 30 Oct 2007 12:35:22 +0000

On Sat, Oct 27, 2007 at 11:06:34AM -0700, Kevin D. Kissell wrote:

> The difference is that, in the case where we are *way* behind in interrupt
> processing, such that the Count value has gone beyond the to the next tick 
> interrupt value, the 2.6.10 code will only try to catch up by a single inteval,
> which may result in having to wait 4 billion cycles for the Count to wrap.
> The 2.6.23.1 version (a) repeats until the programmed Compare value is
> ahead of Count, and (b)  resamples the count register value each time 
> through the loop, which is important if other interrupts may be enabled
> while c0_timer_ack() is running, [...]

The code upto 2.6.23 uses IRQF_DISABLED (which used to be named SA_INTERRUPT
until July 2006) for the timer interrupt in timer_irqaction which is defined
in the generic time.c.

> [...] I could imagine that making a material difference in the presnece
> of "interrupt storms" from I/O devices.

I don't recall any reports of this sort of behaviour before tnishioka's.
This includes no hang reports about the R4000 read from c0_count ever -
because Linux will happen to just nicely tiptoe around the issue for all
real world configurations.

> If I wanted to be pendantic, I would argue that the 2.6.23 is still vulnerable
> to the Count register passing the Compare target between the "if" and the
> write_c0_compare(), and that it would be more airtight to code it more
> like:
>             expirelo = read_c0_count();
>             do {
>                 expirelo += cycles_per_jiffy;
>                 write_c0_compare(expirelo);
>             } while (((read_c0_count()) - expirelo < 0x7fffffff);
> 
> 
> It may well be that the initial value of expirelo should be derived
> from read_c0_compare() and not read_c0_count().  That would
> preserve synchronization of clock ticks against external wall-clock time,
> though the removal of the "slop" would mean that there would be
> slighly more interrupt service events per unit of real time.

I think it should be based on the last compare value.  This is the only
way to ensure interrupts will use a precise timing.

> But I gave up tilting at these windmills a long, long time ago... ;o)

Your windmill has been fixed for 2.6.24.

Now available at your nearest LMO (TM) GIT Store!

The big change is that the new timer code now has a proper concept of
oneshot interrupt timers.  Which is what the compare interrupt really is
despite the continuously running counter.  So this is how the new event
programming code looks like:

static int mips_next_event(unsigned long delta,
                           struct clock_event_device *evt)
{
	unsigned int cnt;
	int res;

	cnt = read_c0_count();
	cnt += delta;
	write_c0_compare(cnt);

	return ((int)(read_c0_count() - cnt) > 0) ? -ETIME : 0;
}

The called will check for the error return and if so invoke the active
clockevent device's ->set_next_event method again with a new suitable
delta.

Qemu btw. can trigger the -ETIME case fairly easily.

But anyway, I don't object a patch to improve theoretical correctness.

  Ralf