Hi Tony,

> On Behalf Of Tony Lindgren
> Sent: Friday, 16 February 2018 17:14
>
> * Isak Lichtenstein <Isak.Lichtenstein@xxxxxxxxxxx> [180216 13:36]:
> > We can imagine the following use cases that could trigger it:
> > - Another process stops the timer accidentally by clearing the ST bit of the TCLR
> >   register.
> > - Setting the ST bit of the TCLR register after loading the TCRR register goes wrong
> >   ( in omap2_gp_timer_set_next_event() ) e.g. interrupt, posted mode.
>
> Well there has been lost interrupt related issue for various drivers because of missing
> flush of posted writes. I doubt that is the issue here as the in that case the write ST
> bit should be set but written too late.
>
> If you have a test case for this, you could try the following hacks to try to narrow it
> down a bit more:
>
> 1. Set the clockevent timer to continuous mode instead of
>    oneshot mode in omap2_gp_timer_set_next_event()
>
>    This way if one interrupt is lost, the timer might trigger
>    again although at n times the programmed length.
>
> 2. Add a read-back fo the timer register after programming
>    it to omap2_gp_timer_set_next_event()
>
>    This obviously will have a performance impact, but
>    might give some clues.

Yep, finally we had time to test your second suggestion.
Here is the patch we applied:

---
 arch/arm/plat-omap/include/plat/dmtimer.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/plat-omap/include/plat/dmtimer.h b/arch/arm/plat-omap/include/plat/dmtimer.h
index dd79f30..b021558 100644
--- a/arch/arm/plat-omap/include/plat/dmtimer.h
+++ b/arch/arm/plat-omap/include/plat/dmtimer.h
@@ -392,8 +392,13 @@
 static inline void __omap_dm_timer_load_start(struct omap_dm_timer *timer,
 						u32 ctrl, unsigned int load, int posted)
 {
+	u32 ctrl_reg = 0;
 	__omap_dm_timer_write(timer, OMAP_TIMER_COUNTER_REG, load, posted);
 	__omap_dm_timer_write(timer, OMAP_TIMER_CTRL_REG, ctrl, posted);
+
+	ctrl_reg = __omap_dm_timer_read(timer, OMAP_TIMER_CTRL_REG, posted);
+	WARN( (!(ctrl_reg & OMAP_TIMER_CTRL_ST)) && (ctrl & OMAP_TIMER_CTRL_ST) && (timer->id == 2),
+		"__omap_dm_timer_load_start(): Timer 2 not started!!\nload=0x%x ; ctrl=0x%x ; ctrl_reg=0x%x, posted=0x%x\n", load, ctrl, ctrl_reg, posted);
 }
 
 static inline void __omap_dm_timer_int_enable(struct omap_dm_timer *timer,
-- 
2.7.4

And indeed we do see the warning popping up more often than expected (on some
devices every couple of hours), but the gp_timer isn't stopped.

Here is the output of such a warning:

Feb 22 09:45:02 DUT1 user.warn kernel: WARNING: CPU: 0 PID: 0 ....../kernel-source/arch/arm/plat-omap/include/plat/dmtimer.h:403 omap2_gp_timer_set_next_event+0xb4/0xcc
Feb 22 09:45:02 DUT1 user.warn kernel: __omap_dm_timer_load_start(): Timer 2 not started!!
Feb 22 09:45:02 DUT1 user.warn kernel: [<c012e5a0>] (warn_slowpath_fmt) from [<c011aba8>] (omap2_gp_timer_set_next_event+0xb4/0xcc)
Feb 22 09:45:02 DUT1 user.warn kernel: [<c011aba8>] (omap2_gp_timer_set_next_event) from [<c017e8a8>] (clockevents_program_min_delta+0x64/0x70)
Feb 22 09:45:02 DUT1 user.warn kernel: [<c017fcd0>] (tick_program_event) from [<c01729a8>] (hrtimer_start_range_ns+0x1f8/0x214)
Feb 22 09:45:02 DUT1 user.warn kernel: [<c01729a8>] (hrtimer_start_range_ns) from [<c0180258>] (tick_nohz_restart.constprop.9+0x80/0xa0)

Unfortunately the log message is not output in full, I guess because of the
"\n" in it. Nevertheless it gives a clue, although I'm not sure what to do
with it, as it leaves me rather confused.

As we read the register back using __omap_dm_timer_read, I would assume that
the "posted" issue should be mitigated. If so, what else can it be? Why would
the gp_timer sometimes stop completely, while at other times it continues
working properly?

Thank you for the initial guidance.

Best regards
Isak