On Wed, Jan 09, 2008 at 05:08:36AM -0800, David Miller wrote: > From: Martin Habets <errandir_news@xxxxxxxxxxxxxxxxx> > Date: Sun, 6 Jan 2008 22:40:13 +0000 > > > The microseconds are determined by: > > (xtime.tv_nsec / 1000) + (l10_counter >> 10) > > I dumped both variables with a simple kernel module (first 2 attachments), > > which shows that xtime is monotonous, but l10_counter looks random to me. > > > > I do not know the problem here, and cannot find details on the operation > > of this counter property on sun4m. Maybe this l10_counter needs to be > > callibrated? > > I understand from include/asm-sparc/timer.h that l10_counter should count > > down on sun4m, which makes the current code all the more puzzling. > > Actually, the l10_counter counts up in microseconds. When the > l10_counter equals l10_limit an interrupt is generated. Glad it counts up not down. The code makes a lot more sense that way! It looks to count in nanoseconds, given the number of overlows I'm seeing. > Those values that look "random" to you have bit 31 set for some > reason. Thanks, I didn't notice that. > Clear that bit and the values look much more sane. > > I couldn't find my old STP1040 et al manuals so I took a look > at the OpenBSD sparc code. It masks the shifted value with > 0x1fffff and has a note in one of it's header files mentioning > that bit 0x80000000 in the counter register means the counter > has hit the limit and the interrupt hasn't been cleared yet > (which is done by reading the limit register). > > We are masking like that so this aspect is fine. > > But it is that case with the 0x80000000 bit being set that > is causing the problems. > > It means an interrupt is pending and the counter wraps back > down to the beginning and starts to count from zero again. > When the interrupt is serviced, xtime would get advanced > forward. > > We need to integrate that pending interrupt event into > the calculations. > > Please try this patch: It almost works. Reaching the l10 limit does not indicate a second overflow, but rather 10 milliseconds. With the patch below I cannot make it fail on UP systems. Output on UP looks like: # insmod mod.ko l10_limit = 10241024 10001 tick_nsec = 10001268 l10_counter shr(10) limit? xtime gettimeofday 10203648 9964 1199930413.956184 1199930413.966145 10218496 9979 1199930413.956184 1199930413.966161 10231808 9992 1199930413.956184 1199930413.966174 2147487744 4 yes 1199930413.956184 1199930413.966187 2147501056 17 yes 1199930413.956184 1199930413.966200 48640 47 1199930413.966185 1199930413.966231 61952 60 1199930413.966185 1199930413.966244 75264 73 1199930413.966185 1199930413.966257 88576 86 1199930413.966185 1199930413.966270 I can still make it fail on SMP, gettimeofday is always off by 10 milliseconds. do_gettimeofday() uses read_seqbegin_irqsave(). This blocks the l10 interrupt from being taken on the local CPU, but I think it will just get taken on the other CPU. In essence xtime and l10_counter are not guaranteed to be consistent on all CPUs. I could reduce the number of failures by dropping the _irqsave/restore. With that the interrupt can be taken on the local CPU, and do_gettimeofday() just goes through the while loop one more time. But what is the best way to realy solve this? Thanks, Martin Index: 2.6/arch/sparc/kernel/time.c =================================================================== --- 2.6.orig/arch/sparc/kernel/time.c 2008-01-09 21:21:05.000000000 +0000 +++ 2.6/arch/sparc/kernel/time.c 2008-01-10 00:55:28.000000000 +0000 @@ -437,7 +437,14 @@ static inline unsigned long do_gettimeoffset(void) { - return (*master_l10_counter >> 10) & 0x1fffff; + unsigned long val = *master_l10_counter; + unsigned long usec = (val >> 10) & 0x1fffff; + + /* Limit hit? */ + if (val & 0x80000000) + usec += TICK_SIZE; + + return usec; } /* Ok, my cute asm atomicity trick doesn't work anymore. - To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html