On Mon, Oct 21, 2019 at 10:48 AM Alexandre Belloni <alexandre.belloni@xxxxxxxxxxx> wrote: > On 21/10/2019 10:20:08-0700, Brian Norris wrote: > > Hi Alexandre! > > > > On Mon, Oct 21, 2019 at 9:11 AM Alexandre Belloni > > <alexandre.belloni@xxxxxxxxxxx> wrote: > > > On 21/05/2018 09:42:22-0700, Brian Norris wrote: > > > > __rtc_read_time() can fail (e.g., if the RTC uses an unreliable medium). > > > > When it does, we don't report the error, but instead calculate a > > > > 1-second alarm based on the potentially-garbage 'tm' (in practice, > > > > __rtc_read_time() zeroes out the time first, so it's likely to still be > > > > all 0). > > > > > > > > Let's propagate the error instead. > > > > > > > > > > I submitted > > > https://lore.kernel.org/linux-rtc/20191021155631.3342-2-alexandre.belloni@xxxxxxxxxxx/T/#u > > > to solve this after looking at all the implication checking the return > > > value of __rtc_read_time had. > > > > Only about a year and a half late, nice! > > I know, right? :) The fact is that this is not a common issue or at > least, I didn't have any report that this was causing real problems in > the field. So it ended up being one of the longest standing patch in > patchwork... I suppose I could have put specific examples in here: the Rockchip RK3399-based Gru family of Chromebooks (arch/arm64/boot/dts/rockchip/rk3399-gru-{kevin,bob,scarlet}*.dts) use the Chrome OS EC-based RTC (drivers/rtc/rtc-cros-ec.c), and the EC protocol is prone to occasional errors. So we definitely have a concrete case where this problem can be tickled. I guess I was too terse in summarizing that as "if the RTC uses an unreliable medium"? As for the actual symptoms: this was part of a variety of problems that resulted in seeing interrupt storms from our EC/RTC when running `hwclock -r`. I believe there were other patches that were more critical to resolving the worst symptoms, but this error was noticed along the way. If you care to read more, you can see our downstream kernel patches here, when we first handled this problem: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1067442 https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1066984 https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1069546 Unfortunately, the bug links are private (they were dealing with partner/factory issues), so you can only glean the implicit information from the code. And since this was over a year ago, my memory is a little fuzzy on what exactly the source of the interrupt storm was... > >Fortunately we have a decent > > (albeit time-consuming) process for rebasing our downstream patches in > > Chrome OS kernels... > > > > Do you need that patch backported to LTS kernels? Eh, I dunno. If anything that'll just cause us merge troubles (but not too much) on our Chrome OS kernels, which already carry my patch. But if there are any non-Chrome-OS users of these Chromebooks (there are) that are seeing this problem (I'm not sure), they might appreciate it. By the way, I wonder if your patch actually deserves a "Reported-by". I suppose I also left off Jeffy as the reporter, but it would be: Reported-by: Jeffy Chen <jeffy.chen@xxxxxxxxxxxxxx> Reported-by: Brian Norris <briannorris@xxxxxxxxxxxx> Brian