[RESEND PATCH] timerfd: Allow TFD_TIMER_CANCEL_ON_SET with relative timeouts

Jesper Nilsson <jesper.nilsson@xxxxxxxx> · Fri, 9 Oct 2015 10:25:14 +0200

Allow TFD_TIMER_CANCEL_ON_SET on timerfd_settime() with relative
as well as absolute timeouts.

Signed-off-by: Jesper Nilsson <jesper.nilsson@xxxxxxxx>
---
Resending after some discussion with Thomas Gleixner at ELCE,
and Cc:ing John Stultz and Michael Kerrisk who may have comments.

Longer background:

One of the uses for TFD_TIMER_CANCEL_ON_SET is to get
an event when the CLOCK_REALTIME changes (as by NTP or user action).
In this case, the timeout is irrelevant, and the maximum
available value would be selected to avoid mis-triggers.

However, timerfd uses time_t for configuration, and the maximum
value on a 32bit time_t system is actually a valid time
(near 2038-01-19 03:14) in the 64bit ktime_t used internally in timerfd.
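
The arithmetic behind that date can be checked with a few lines of C.
This is only an illustrative sketch; format_max_time32() is a made-up
name, and it assumes a C library whose gmtime() can represent the value
(any system where the code builds with a time_t of 32 bits or more).

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Format the largest value a signed 32bit time_t can hold
 * (2^31 - 1 = 2147483647 seconds after the epoch) as a UTC date.
 * Returns the number of characters written, or 0 on failure.
 */
size_t format_max_time32(char *buf, size_t len)
{
	time_t t = (time_t)INT32_MAX;
	struct tm *tm = gmtime(&t);

	if (!tm)
		return 0;
	return strftime(buf, len, "%Y-%m-%d %H:%M:%S", tm);
}
```

Called with a 32 byte buffer, this yields "2038-01-19 03:14:07", which
is an ordinary point in time for the 64bit ktime_t.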

One way of provoking this problem would be to set the time
using "date '2038-01-19 03:14'" and let the time roll over
a few seconds later.

After this time, a timerfd will continuously fire
when configured with a maximum absolute timeout,
potentially stealing all CPU and stopping the application
from doing what it really should be doing.
Which would be fine, unless the application is systemd
and loops at startup, leaving the system in a state where
the kernel is up, but nothing running in userspace. :-(

This problem was further exposed in kernel v3.19 by
commit a6d6e1c879efa4b77e250c34fe5fe1c34e6ef070
which introduced 64bit time in the RTC subsystem.
On an unconfigured RTC or an RTC with flat/removed battery
the date could be random, and in some cases past 2038.

Of course, the proposed patch only allows relative timeouts to be set
with TFD_TIMER_CANCEL_ON_SET. Any application using the flag would
also need to be patched to use a relative timer for this to solve
the described problem.

Another solution would be to add a new flag to timerfd_settime()
to indicate that the timer value is irrelevant, but I considered
it an unnecessary waste of a flag-bit.

A third possible solution is to steal the time (0,0)
(which currently gives -EINVAL) to indicate that the timeout
is irrelevant. This would however be a change in behaviour
that a current user wouldn't expect, perhaps making a previously
correctly failing application wait until a time change (or forever).

Tested on a MIPS 34Kc with kernel v4.1.

 fs/timerfd.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/timerfd.c b/fs/timerfd.c
index b94fa6c..8ec3aeb 100644
--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -134,7 +134,7 @@ static void timerfd_setup_cancel(struct timerfd_ctx *ctx, int flags)
 {
 	if ((ctx->clockid == CLOCK_REALTIME ||
 	     ctx->clockid == CLOCK_REALTIME_ALARM) &&
-	    (flags & TFD_TIMER_ABSTIME) && (flags & TFD_TIMER_CANCEL_ON_SET)) {
+	    (flags & TFD_TIMER_CANCEL_ON_SET)) {
 		if (!ctx->might_cancel) {
 			ctx->might_cancel = true;
 			spin_lock(&cancel_lock);
-- 
1.7.10.4


/^JN - Jesper Nilsson
-- 
               Jesper Nilsson -- jesper.nilsson@xxxxxxxx