Joe Perches <joe@xxxxxxxxxxx> writes:

> On Fri, 2015-09-04 at 18:00 -0700, John Stultz wrote:
>> On Fri, Sep 4, 2015 at 5:57 PM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
>> > On Thu, Sep 3, 2015 at 4:26 AM, Miroslav Lichvar <mlichvar@xxxxxxxxxx> wrote:
>> >> On Wed, Sep 02, 2015 at 04:16:00PM -0700, John Stultz wrote:
>> >>> On Tue, Sep 1, 2015 at 6:14 PM, Nuno Gonçalves <nunojpg@xxxxxxxxx> wrote:
>> >>> > And just installing chrony from the feeds. With any kernel from 3.17
>> >>> > you'll have wrong estimates at chronyc sourcestats.
>> >>>
>> >>> Wrong estimates? Could you be more specific about what the failure
>> >>> you're seeing is here? The
>> >>>
>> >>> I installed the image above, which comes with a 4.1.6 kernel, and
>> >>> chrony seems to have gotten my BBB into ~1ms sync w/ servers over the
>> >>> internet fairly quickly (at least according to chronyc tracking).
>> >>
>> >> To see the bug with chronyd the initial offset shouldn't be very close
>> >> to zero, so it's forced to correct the offset by adjusting the
>> >> frequency in a larger step.
>> >>
>> >> I'm attaching a simple C program that prints the frequency offset
>> >> as measured between the REALTIME and MONOTONIC_RAW clocks when the
>> >> adjtimex tick is set to 9000. It should show values close to -100000
>> >> ppm and I suspect on the BBB it will be much smaller.
>> >
>> > So I spent some time on this late last night and this afternoon.
>> >
>> > It was a little odd because things don't seem totally broken, but
>> > something isn't quite right.
>> >
>> > Digging around it seems the iterative logarithmic approximation done in
>> > timekeeping_freqadjust() wasn't working right. Instead of making
>> > smaller order alternating positive and negative adjustments, it was
>> > doing strange growing adjustments for the same value that weren't large
>> > enough to actually correct things very quickly. This made it much
>> > slower to adapt to specified frequency values.
>> >
>> > The odd bit, is it seems to come down to:
>> >     tick_error = abs(tick_error);
>> >
>> > Haven't chased down why yet, but apparently abs() isn't doing what one
>> > would think when passed a s64 value.
>>
>> Well.. chasing it down wasn't hard.. from include/linux/kernel.h:
>> /*
>>  * abs() handles unsigned and signed longs, ints, shorts and chars.  For all
>>  * input types abs() returns a signed long.
>>  * abs() should not be used for 64-bit types (s64, u64, long long) - use abs64()
>>  * for those.
>>  */
>>
>> Ouch.
>
> Here's a little cocci script that finds more of these in:

Thanks.  Maybe we should also:

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 5582410727cb..aa7d69afdcac 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -208,6 +208,7 @@ extern int _cond_resched(void);
  */
 #define abs(x) ({						\
 		long ret;					\
+		BUILD_BUG_ON(sizeof(x) > sizeof(long));		\
 		if (sizeof(x) == sizeof(long)) {		\
 			long __x = (x);				\
 			ret = (__x < 0) ? -__x : __x;		\

so that people won't make the same mistake again.

That finds bugs in

  drivers/md/raid10.c
  drivers/gpu/drm/radeon/radeon_display.c
  kernel/time/clocksource.c
  kernel/time/timekeeping.c
  fs/ext4/mballoc.c

that your cocci script missed.  All "abs(x - y)".

As sector_t can be 32bit and can be 64bit, I wonder if abs_sector()
would be a good idea ... probably not.

Thoughts?

NeilBrown
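
For anyone who wants to see the truncation outside the kernel, here is a
rough user-space sketch.  It is not the kernel macro: long32_t,
abs_via_long32() and abs_s64() are made-up names for illustration, and
int32_t stands in for a 32-bit "long" so the result is the same on a
64-bit host.  The narrowing is spelled out by hand to avoid relying on
implementation-defined conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for 'long' on a 32-bit architecture. */
typedef int32_t long32_t;

/* The demo only makes sense if the wide type is larger than the emulated long. */
static_assert(sizeof(int64_t) > sizeof(long32_t),
	      "wide type must be larger than the emulated long");

/*
 * Mimics what happens when a 64-bit value is pushed through an abs()
 * that works on a 32-bit long: the value is first narrowed to its low
 * 32 bits, and only then is the sign flipped.
 */
static long32_t abs_via_long32(int64_t x)
{
	uint32_t low = (uint32_t)(uint64_t)x;	/* keep the low 32 bits */
	/* Reinterpret the low bits as a signed 32-bit value, explicitly. */
	long32_t v = (low >= 0x80000000u)
		? (long32_t)((int64_t)low - 4294967296LL)
		: (long32_t)low;

	return (v < 0) ? -v : v;	/* INT32_MIN is not handled here */
}

/* Full-width absolute value, playing the role of abs64(). */
static int64_t abs_s64(int64_t x)
{
	return (x < 0) ? -x : x;	/* INT64_MIN is not handled here */
}
```

With x = -5000000000 the narrowed version returns 705032704 while the
full-width one returns 5000000000; for values that fit in 32 bits the
two agree, which would explain why things don't look totally broken.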