Here's a patch that fixes a timing bug in tailf. To tickle the bug,
generate some data, and write it to a file like this:
$ perl -MTime::HiRes -MIO::Handle -e 'while(1) { $i++; Time::HiRes::usleep( 30000); print "abcde $i "; flush STDOUT; print "fghijklmnopqrstuvwxyz $i\n"; flush STDOUT; }' > /tmp/log
In another shell, read the log file with tailf. The output looks something like this:
$ tailf /tmp/log
abcde 70 fghijklmnopqrstuvwxyz 70
abcde 71 fghijklmnopqrstuvwxyz 71
abcde 72 fghijklmnopqrstuvwxyz 72
abcde 73 fghijklmnopqrstuvwxyz 73
abcde 74 fghijklmnopqrstuvwxyz 74
abcde 75 fghijklmnopqrstuvwxyz 75
abcde 76 fghijklmnopqrstuvwxyz 76
...
abcde 117 fghijklmnopqrstuvwxyz 117
abcde 118 fghijklmnopqrstuvwxyz 118
abcde 119 fghijklmnopqrstuvwxyz 119
abcde 120 fghijklmnopqrstuvwxyz 120
xyz 120
abcde 121 fghijklmnopqrstuvwxyz 121
abcde 122 fghijklmnopqrstuvwxyz 122
abcde 123 fghijklmnopqrstuvwxyz 123
abcde 124 fghijklmnopqrstuvwxyz 124
abcde 125 fghijklmnopqrstuvwxyz 125
...
abcde 127 fghijklmnopqrstuvwxyz 127
abcde 299 fghijklmnopqrstuvwxyz 299
abcde 300 fghijklmnopqrstuvwxyz 300
abcde 301 fghijklmnopqrstuvwxyz 301
abcde 302 fghijklmnopqrstuvwxyz 302
abcde 303 fghijklmnopqrstuvwxyz 303
abcde 304 fghijklmnopqrstuvwxyz 304
fghijklmnopqrstuvwxyz 304
abcde 305 fghijklmnopqrstuvwxyz 305
abcde 306 fghijklmnopqrstuvwxyz 306
abcde 307 fghijklmnopqrstuvwxyz 307
abcde 308 fghijklmnopqrstuvwxyz 308
I elided most of the non-problematic lines for brevity. Note that
"xyz 120" and "fghijklmnopqrstuvwxyz 304" were printed twice even
though they appear in the data file only once. The issue is that
roll_file() calls fstat() to find the file size, read()s as much data
as it can, and then uses the previously saved file size to mark its
position. The bug occurs when read() goes past the size reported by
fstat() because more data arrived in the meantime: the marker then
lags behind what was actually printed, and the next call prints those
extra bytes again. The attached patch uses the current file position
as the location marker instead, with some extra logic to handle
tailing truncated files.
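
To show the idea in isolation, here is a minimal standalone sketch. It is
not the tailf.c code: copy_new_data() and its offset parameter are
hypothetical names, and the truncation handling (jump to the new end of
file) is an assumption modelled on the patch's fallback to st.st_size.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from tailf.c): copy everything between *offset
 * and the end of the file to stdout, then update *offset.
 * Returns the number of bytes copied, or -1 on error.
 */
static long copy_new_data(const char *filename, off_t *offset)
{
    char buf[BUFSIZ];
    struct stat st;
    ssize_t n;
    long total = 0;
    off_t pos;
    int fd;

    assert(filename != NULL && offset != NULL);

    fd = open(filename, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    /* Assumption: a file smaller than our marker was truncated, so
     * fall back to the size fstat() reported and print nothing. */
    if (st.st_size < *offset) {
        *offset = st.st_size;
        close(fd);
        return 0;
    }

    if (lseek(fd, *offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    /* read() may return bytes that were appended after the fstat() above. */
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, (size_t)n, stdout);
        total += n;
    }
    fflush(stdout);

    /* Record where the reads actually stopped, not the stale st.st_size. */
    pos = lseek(fd, 0, SEEK_CUR);
    *offset = (pos != (off_t)-1) ? pos : st.st_size;

    close(fd);
    return total;
}
```

Calling this repeatedly with the same offset variable should print each
byte once, even if the writer appends between the fstat() and the last
read(), because the marker always matches what was actually consumed.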
dima
From f15b668052cba235cd63c4afe2f5e74ca9277060 Mon Sep 17 00:00:00 2001
From: Dima Kogan <dkogan@xxxxxxxxxxxxxxx>
Date: Sat, 14 Aug 2010 01:55:13 -0700
Subject: [PATCH] tailf: fixed timing issue that could cause duplicate data output
Signed-off-by: Dima Kogan <dkogan@xxxxxxxxxxxxxxx>
---
text-utils/tailf.c | 14 +++++++++++++-
1 files changed, 13 insertions(+), 1 deletions(-)
diff --git a/text-utils/tailf.c b/text-utils/tailf.c
index 75998ce..b4d51a6 100644
--- a/text-utils/tailf.c
+++ b/text-utils/tailf.c
@@ -111,8 +111,20 @@ roll_file(const char *filename, off_t *size)
}
fflush(stdout);
}
+
+ off_t pos = lseek(fd, 0, SEEK_CUR);
+
+ // If we've successfully read something, use the file position:
+ // this avoids data duplication
+ if(pos != -1 && pos != *size)
+ *size = pos;
+
+ // If we read nothing or hit an error, reset to the reported size:
+ // this handles truncated files
+ else
+ *size = st.st_size;
+
close(fd);
- *size = st.st_size;
}
static void
--
1.7.1