On Thu, Dec 03 2020 at 12:16, Jason Gunthorpe wrote:
> On Thu, Dec 03, 2020 at 04:39:21PM +0100, Thomas Gleixner wrote:
>
>> The logic in sync_cmos_clock() and rtc_set_ntp_time() is different as I
>> pointed out: sync_cmos_clock() hands -500ms to rtc_tv_nsec_ok() and
>> rtc_set_ntp_time() uses +500ms, IOW exactly ONE second difference in
>> behaviour.
>
> I understood this is because the two APIs work differently, rmk
> explained this as:
>
>> 1. kernel/time/ntp.c assumes that all RTCs want to be told to set the
>>    time at around 500ms into the second.
>>
>> 2. drivers/rtc/systohc.c assumes that if the time being set is >= 500ms,
>>    then we want to set the _next_ second.
>
> ie one path is supposed to round down and one path is supposed to
> round up, so you get to that 1s difference..
>
> IIRC this is also connected to why the offset is signed..

The problem is that it is device specific and therefore having the
offset parameter is a good starting point.

Lets look at the two scenarios:

1) Direct accessible RTC:

   tsched t1                     t2
          write(newsec)          RTC increments seconds

   For rtc_cmos/MC1... tinc = t2 - t1 = 500ms

   There are RTCs which reset the thing on write so tinc = t2 - t1 = 1000ms

   No idea what other variants are out there, but the principle is the
   same for all of them.

   Lets assume that the event is accurate for now and ignore the fuzz
   logic, i.e. tsched == t1

   tsched must be scheduled to happen tinc before wallclock increments
   seconds so that the RTC increments seconds at the same time.

   That means newsec = t1.tv_sec.

   So now the fuzz logic for the legacy cmos path does:

      newtime = t1 - tinc;

      if (newtime.tv_nsec < FUZZ)
         newsec = newtime.tv_sec;
      else if (newtime.tv_nsec > NSEC_PER_SEC - FUZZ)
         newsec = newtime.tv_sec + 1;
      else
         goto fail;

   The first condition handles the case where t1 >= tsched and the second
   one where t1 < tsched.

   We need the same logic for rtc_cmos() when the update goes through
   the RTC path, which is broken today. See below.

2) I2C/SPI ...

   tsched t0                 t1                     t2
          transfer(newsec)   RTC update (newsec)    RTC increments seconds

   Lets assume that ttransfer = t1 - t0 is known.

   tinc is the same as above = t2 - t1

   Again, lets assume that the event is accurate for now and ignore the
   fuzz logic, i.e. tsched == t0

   So tsched has to be ttot = t2 - t0 _before_ wallclock reaches t2 and
   increments seconds.

   In this case newsec = t1.tv_sec = (t0 + ttransfer).tv_sec

   So now the fuzz logic for this is:

      newtime = t0 + ttransfer;

      if (newtime.tv_nsec < FUZZ)
         newsec = newtime.tv_sec;
      else if (newtime.tv_nsec > NSEC_PER_SEC - FUZZ)
         newsec = newtime.tv_sec + 1;
      else
         goto fail;

   Again the first condition handles the case where t1 >= tsched and the
   second one where t1 < tsched.

So now we have two options to fix this:

   1) Use a negative sync_offset for devices which need #1 above
      (rtc_cmos & similar)

      That requires setting tsched to t2 - abs(sync_offset)

   2) Use always a positive sync_offset and a flag which tells
      rtc_tv_nsec_ok() whether it needs to add or subtract.

#1 is good enough. All it takes is a comment at the timer start code why
abs() is required.

Let me hack that up along with the hrtimer muck.

Thanks,

        tglx
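
For reference, the two fuzz checks above collapse into one helper once the
offset is allowed to be signed, which is what option #1 relies on. The
following is a minimal userspace sketch in C, not the kernel code: struct ts,
ts_add_ns() and pick_newsec() are hypothetical names made up for illustration,
and the fuzz value is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for a timespec: tv_nsec is kept in [0, NSEC_PER_SEC). */
struct ts {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Add a signed nanosecond offset and renormalise tv_nsec into [0, 1s). */
static struct ts ts_add_ns(struct ts t, int64_t ns)
{
	int64_t total = t.tv_sec * NSEC_PER_SEC + t.tv_nsec + ns;
	struct ts r;

	r.tv_sec = total / NSEC_PER_SEC;
	r.tv_nsec = total % NSEC_PER_SEC;
	if (r.tv_nsec < 0) {		/* C division truncates toward zero */
		r.tv_nsec += NSEC_PER_SEC;
		r.tv_sec--;
	}
	return r;
}

/*
 * Assumed convention for this sketch:
 *   offset_ns < 0: scenario 1, newtime = t1 - tinc
 *   offset_ns > 0: scenario 2, newtime = t0 + ttransfer
 *
 * Returns true and stores the second to write when 'now' is within
 * fuzz_ns of the intended point, false otherwise (the "goto fail" case).
 */
static bool pick_newsec(struct ts now, int64_t offset_ns, int64_t fuzz_ns,
			int64_t *newsec)
{
	struct ts newtime;

	assert(fuzz_ns >= 0 && fuzz_ns < NSEC_PER_SEC / 2);

	newtime = ts_add_ns(now, offset_ns);

	if (newtime.tv_nsec < fuzz_ns)
		*newsec = newtime.tv_sec;	/* fired on time or slightly late */
	else if (newtime.tv_nsec > NSEC_PER_SEC - fuzz_ns)
		*newsec = newtime.tv_sec + 1;	/* fired slightly early */
	else
		return false;

	return true;
}
```

With tinc = 500ms and a 5ms fuzz, calling pick_newsec() at 100.500s or at
100.498s with an offset of -500ms selects second 100 in both cases, while a
call at 100.700s is rejected. The sign of the offset only affects how newtime
is computed; the timer start code would still need abs(offset) to place
tsched before t2.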