On Mon, Oct 31, 2022 at 11:14:23PM +0100, Alexandre Belloni wrote:
> On 31/10/2022 11:19:13-0700, Guenter Roeck wrote:
> > On Mon, Oct 31, 2022 at 06:10:53PM +0100, Alexandre Belloni wrote:
> > > Hello,
> > >
> > > On 28/10/2022 17:54:00-0700, Guenter Roeck wrote:
> > > > RTC chips on some older Chromebooks can only handle alarms less than 24
> > > > hours in the future. Attempts to set an alarm beyond that range fails.
> > > > The most severe impact of this limitation is that suspend requests fail
> > > > if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> > > > in the future.
> > > >
> > > > Try to set the real-time alarm to just below 24 hours if setting it to
> > > > a larger value fails to work around the problem. While not perfect, it
> > > > is better than just failing the call. A similar workaround is already
> > > > implemented in the rtc-tps6586x driver.
> > >
> > > I'm not super convinced this is actually better than failing the call
> > > because your are implementing policy in the driver which is bad from a
> > > user point of view. It would be way better to return -ERANGE and let
> > > userspace select a better alarm time.
> >
> > The failing call is from alarmtimer_suspend() which is called during suspend.
> > It is not from userspace, and userspace has no chance to intervene.
> >
> > It is also not just one userspace application which could request a large
> > timeout, it is a variety of userspace applications, and not all of them are
> > written by Google. Some are Android applications. I don't see how it would be
> > realistic to expect all such applications to fix their code (if that is even
> > possible - there might be an application which called sleep(100000) or
> > something equivalent, which works just fine as long as the system is not
> > suspended.
> >
> > > Do you have to know in advance which are the "older" chromebooks that
> > > are affected?
> >
> > Not sure I understand the question.
> > Technically we know, but the cros_ec
> > rtc driver doesn't know because the EC doesn't have an API to report the
> > maximum timeout to the Linux driver. Even if that existed, it would not
> > help because the rtc API only supports absolute maximum clock values,
> > not clock offsets relative to the current time. So ultimately there is no
> > means for an RTC driver to tell the maximum possible alarm timer offset to
> > the RTC subsystem, and there is no means for a user such as
> > alarmtimer_suspend() to obtain the maximum time offset. Does that answer
> > your question ?
>
> Yes, my question was missing a few words, sorry I wanted to know if you
> had *a way* to know.
>

See below. It is doable, but there is no real good solution, or at least
I don't see one right now.

> >
> > On a side note, I tried an alternate implementation by adding a retry into
> > alarmtimer_suspend(), where it would request a smaller timeout if the
> > requested timeout failed. I did not pursue/submit this since it seemed
> > hacky. To solve that problem, I'd rather discuss extending the RTC API
> > to provide a maximum offset to its users. Such a solution would probably
> > be desirable, but that it more longer term and would not solve the
> > immediate problem.
>
> Yes, this is what I was aiming for. This is something that is indeed
> missing in the RTC API and that I already thought about. But indeed, it
> would be great to have a way to set the alarm range separately from the
> time keeping range. This would indeed have to be a range relative to the
> current time.
>
> alarmtimer_suspend() can then get the allowed alarm range for the RTC,
> and set the alarm to max(alarm range, timer value) and loop until the
> timer has expired. Once we have this API, userspace can do the same.
>
> I guess that ultimately, this doesn't help your driver unless you are
> wanting to wakeup all the chromebooks at least once a day regardless of
> their EC.

That is a no-go.
It would reduce battery lifetime on all Chromebooks, including
those not affected by the problem (that is, almost all of them).

To implement reporting the maximum supported offset, I'd probably either
try to identify affected Chromebooks using devicetree information, or
send an alarm request > 24h in the future in the probe function and set
the maximum offset to just below 24h if that request fails. We'd have to
discuss the best approach internally.

In either case, that doesn't help with the short-term problem, which we
have to solve now with a fix that can be backported to older kernels.

It also won't help userspace - userspace alarm requests, as Brian has
pointed out, are separate from the limits supported by the RTC hardware.
We cannot change the API for CLOCK_xxx_ALARM to userspace, and doing so
would not make sense anyway since it works just fine as long as the
system isn't suspended. Besides, changing alarmtimer_suspend() as you
suggest above would solve the problem for userspace, so I don't see a
need for a userspace API/ABI change unless I am missing something.

Thanks,
Guenter
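
For illustration only, here is a rough userspace sketch of the re-arm loop
discussed above. It assumes a hypothetical maximum alarm offset just below
24 hours and a simulated RTC that rejects anything larger. None of the names
below (MAX_ALARM_OFFSET_S, sim_rtc_set_alarm(), next_alarm_offset(),
wakeups_until_expiry()) exist in the kernel, and this is not the cros_ec or
alarmtimer code. Note that each step programs the smaller of the remaining
time and the supported limit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit: largest alarm offset the RTC accepts, in seconds. */
#define MAX_ALARM_OFFSET_S (24 * 60 * 60 - 1)

/* Simulated RTC: rejects any alarm 24 hours or more ahead of "now". */
static int sim_rtc_set_alarm(int64_t now, int64_t alarm)
{
	if (alarm - now > MAX_ALARM_OFFSET_S)
		return -1;	/* stands in for an error such as -ERANGE */
	return 0;
}

/* Offset to program next: the smaller of remaining time and the limit. */
static int64_t next_alarm_offset(int64_t remaining, int64_t max_offset)
{
	assert(max_offset > 0);	/* a zero limit would never make progress */
	return remaining < max_offset ? remaining : max_offset;
}

/*
 * Re-arm the alarm until the requested expiry is reached.
 * Returns the number of wakeups needed, or -1 if the RTC rejects an alarm.
 */
static int wakeups_until_expiry(int64_t now, int64_t expiry)
{
	int wakeups = 0;

	while (now < expiry) {
		int64_t step = next_alarm_offset(expiry - now,
						 MAX_ALARM_OFFSET_S);

		if (sim_rtc_set_alarm(now, now + step))
			return -1;
		now += step;	/* pretend the system slept until the alarm */
		wakeups++;
	}
	return wakeups;
}
```

With these assumptions, a timer 100000 seconds out (the sleep(100000) case
mentioned earlier in the thread) needs two wakeups instead of failing the
suspend, while a timer one hour out still needs only one.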