On 04/07/2010 09:05 AM, Wu Zhangjin wrote:
From: Wu Zhangjin <wuzhangjin@xxxxxxxxx>

Changes:

v1 -> v2:

  o change the old mips_sched_clock() to mips_cyc2ns() and modify the
    arguments to support 32bit.
  o add 32bit support: use a smaller shift to avoid the quick overflow of
    64bit arithmetic and balance the overhead of the 128bit arithmetic
    against the precision lost with the smaller shift.

----------------------

The high resolution sched_clock() for r4k has the same overflow problem,
and the same solution, as described in "MIPS: Octeon: Use non-overflowing
arithmetic in sched_clock":

  "With typical mult and shift values, the calculation for Octeon's
  sched_clock overflows when using 64-bit arithmetic. Use 128-bit
  calculations instead."

To reduce the duplication, this patch abstracts the solution into an
inline function mips_cyc2ns(), moved into arch/mips/include/asm/time.h
from arch/mips/cavium-octeon/csrc-octeon.c.

Two patches for Cavium and R4K will be sent out respectively to use this
common function.

Signed-off-by: Wu Zhangjin <wuzhangjin@xxxxxxxxx>
---
 arch/mips/include/asm/time.h |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/asm/time.h b/arch/mips/include/asm/time.h
index c7f1bfe..898f0e0 100644
--- a/arch/mips/include/asm/time.h
+++ b/arch/mips/include/asm/time.h
@@ -96,4 +96,42 @@ static inline void clockevent_set_clock(struct clock_event_device *cd,
 	clockevents_calc_mult_shift(cd, clock, 4);
 }
 
+static inline unsigned long long mips_cyc2ns(u64 cyc, u64 mult, u64 shift)
+{
+#ifdef CONFIG_32BIT
+	/*
+	 * To balance the overhead of 128bit-arithmetic and the precision
+	 * lost, we choose a smaller shift to avoid the quick overflow as the
+	 * X86 & ARM does. Please refer to arch/x86/kernel/tsc.c and
+	 * arch/arm/plat-orion/time.c
+	 */
+	return (cyc * mult) >> shift;
+#else /* CONFIG_64BIT */
+	/* 64-bit arithmetic can overflow, so use 128-bit. */
+#if (__GNUC__ < 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ <= 3))
+	u64 t1, t2, t3;
+	unsigned long long rv;
+
+	asm (
+		"dmultu\t%[cyc],%[mult]\n\t"
+		"nor\t%[t1],$0,%[shift]\n\t"
+		"mfhi\t%[t2]\n\t"
+		"mflo\t%[t3]\n\t"
+		"dsll\t%[t2],%[t2],1\n\t"
+		"dsrlv\t%[rv],%[t3],%[shift]\n\t"
+		"dsllv\t%[t1],%[t2],%[t1]\n\t"
+		"or\t%[rv],%[t1],%[rv]\n\t"
+		: [rv] "=&r" (rv), [t1] "=&r" (t1), [t2] "=&r" (t2), [t3] "=&r" (t3)
+		: [cyc] "r" (cyc), [mult] "r" (mult), [shift] "r" (shift)
+		: "hi", "lo");
+	return rv;
+#else /* GCC > 4.3 do it the easy way. */
+	unsigned int __attribute__((mode(TI))) t = cyc;
+
+	t = (t * mult) >> shift;
+	return (unsigned long long)t;
+#endif
+#endif /* CONFIG_64BIT */
+}
+
 #endif /* _ASM_TIME_H */

It turns out that all GCC versions can handle the inline asm way. It has also been noted that the default Debian compiler somehow has problems with the 'easy way'.
Therefore, I would recommend getting rid of the GCC version conditionals and just leaving the inline asm.
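
For anyone reading along, here is a rough portable-C sketch of what the inline asm path computes: the full 128-bit product of cyc and mult, shifted right by shift, truncated to 64 bits. The names mul_64x64() and cyc2ns_sketch() are made up for illustration and are not part of the patch, and the sketch assumes shift < 64. It is only meant to show the arithmetic, not to replace the asm.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): 64x64 -> 128-bit unsigned
 * multiply built from 32-bit halves, standing in for dmultu/mfhi/mflo.
 */
static void mul_64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t p0 = a_lo * b_lo;
	uint64_t p1 = a_lo * b_hi;
	uint64_t p2 = a_hi * b_lo;
	uint64_t p3 = a_hi * b_hi;

	/* Middle column plus carry from p0; at most 3 * (2^32 - 1), so no overflow. */
	uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

	*lo = (mid << 32) | (uint32_t)p0;
	*hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Hypothetical portable equivalent of the asm path; assumes shift < 64. */
static uint64_t cyc2ns_sketch(uint64_t cyc, uint64_t mult, unsigned int shift)
{
	uint64_t hi, lo;

	assert(shift < 64);
	mul_64x64(cyc, mult, &hi, &lo);

	/*
	 * (hi << 1) << (63 - shift) is hi << (64 - shift) without ever
	 * shifting by 64 when shift == 0; this mirrors the dsll by 1
	 * followed by dsllv with the nor-complemented shift count.
	 */
	return ((hi << 1) << (63 - shift)) | (lo >> shift);
}
```

As an example of the overflow, take cyc = 1 << 40, mult = 1 << 30 and shift = 30: the product is 2^70, so a plain 64-bit (cyc * mult) >> shift yields 0, whereas the 128-bit calculation yields 1 << 40.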
David Daney