[PATCH 2/3] generic/465: only complain about stale disk contents when racing directio

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



From: Darrick J. Wong <djwong@xxxxxxxxxx>

This test does a strange thing with directio -- it races a reader thread
with an appending aio writer thread and checks that the reader thread
only ever sees a (probably short) buffer containing the same contents
that are being read.

However, this has never worked correctly on XFS, which supports
concurrent readers and writers for directio.  Say you already have a
file with a single written mapping A:

AAAAAAAAAA
0        EOF

Then one thread initiates an aligned appending write:

AAAAAAAAAA---------
0        EOF      new_EOF

However, the free space is fragmented, so the file range maps to
multiple extents b and c (lowercase means unwritten here):

AAAAAAAAAAbbbbccccc
0        EOF      new_EOF

This implies separate bios for b and c.  Both bios are issued, but c
completes first.  The ioend for c will extend i_size all the way to
new_EOF.  Extent b is still marked unwritten because it hasn't completed
yet.

Next, the test reader slips in and tries to read the range between the
old EOF and the new EOF.  The file looks like this now:

AAAAAAAAAAbbbbCCCCC
0        EOF      new_EOF

So the reader sees "bbbbCCCCC" in the mapping, and the buffer returned
contains a range of zeroes followed by whatever was written to C.

For pagecache IO I would say that i_size should not be extended until
the extending write is fully complete, but the pagecache also
coordinates access so that reads and writes cannot conflict.

However, this is directio.  Reads and writes to the storage device can
be issued and acknowledged in any order.  I asked Ted and Jan about this
point, and they echoed that for directio it's expected that application
software must coordinate access themselves.

In other words, the only thing that the reader can check here is that
the filesystem is not returning stale disk contents.  Amend the test so
that null bytes in the reader buffer are acceptable.

Cc: tytso@xxxxxxx, jack@xxxxxxx
Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
---
 .../aio-dio-append-write-read-race.c               |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/src/aio-dio-regress/aio-dio-append-write-read-race.c b/src/aio-dio-regress/aio-dio-append-write-read-race.c
index 911f27230b..d9f8982f00 100644
--- a/src/aio-dio-regress/aio-dio-append-write-read-race.c
+++ b/src/aio-dio-regress/aio-dio-append-write-read-race.c
@@ -191,7 +191,7 @@ int main(int argc, char *argv[])
 		}
 
 		for (j = 0; j < rdata.read_sz; j++) {
-			if (rdata.buf[j] != 'a') {
+			if (rdata.buf[j] != 'a' && rdata.buf[j] != 0) {
 				fail("encounter an error: "
 					"block %d offset %d, content %x\n",
 					i, j, rbuf[j]);




[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux