From: Darrick J. Wong <djwong@xxxxxxxxxx> This test does a strange thing with directio -- it races a reader thread with an appending aio writer thread and checks that the reader thread only ever sees a (probably short) buffer containing the same contents that are being read. However, this has never worked correctly on XFS, which supports concurrent readers and writers for directio. Say you already have a file with a single written mapping A: AAAAAAAAAA 0 EOF Then one thread initiates an aligned appending write: AAAAAAAAAA--------- 0 EOF new_EOF However, the free space is fragmented, so the file range maps to multiple extents b and c (lowercase means unwritten here): AAAAAAAAAAbbbbccccc 0 EOF new_EOF This implies separate bios for b and c. Both bios are issued, but c completes first. The ioend for c will extend i_size all the way to new_EOF. Extent b is still marked unwritten because it hasn't completed yet. Next, the test reader slips in and tries to read the range between the old EOF and the new EOF. The file looks like this now: AAAAAAAAAAbbbbCCCCC 0 EOF new_EOF So the reader sees "bbbbCCCCC" in the mapping, and the buffer returned contains a range of zeroes followed by whatever was written to C. For pagecache IO I would say that i_size should not be extended until the extending write is fully complete, but the pagecache also coordinates access so that reads and writes cannot conflict. However, this is directio. Reads and writes to the storage device can be issued and acknowledged in any order. I asked Ted and Jan about this point, and they echoed that for directio it's expected that application software must coordinate access themselves. In other words, the only thing that the reader can check here is that the filesystem is not returning stale disk contents. Amend the test so that null bytes in the reader buffer are acceptable. Cc: tytso@xxxxxxx, jack@xxxxxxx Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> --- .../aio-dio-append-write-read-race.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/aio-dio-regress/aio-dio-append-write-read-race.c b/src/aio-dio-regress/aio-dio-append-write-read-race.c index 911f27230b..d9f8982f00 100644 --- a/src/aio-dio-regress/aio-dio-append-write-read-race.c +++ b/src/aio-dio-regress/aio-dio-append-write-read-race.c @@ -191,7 +191,7 @@ int main(int argc, char *argv[]) } for (j = 0; j < rdata.read_sz; j++) { - if (rdata.buf[j] != 'a') { + if (rdata.buf[j] != 'a' && rdata.buf[j] != 0) { fail("encounter an error: " "block %d offset %d, content %x\n", i, j, rbuf[j]);