Re: [patch, v3] add an aio test which closes the fd before destroying the ioctx

On Tue, Jun 24, 2014 at 03:34:27PM -0400, Jeff Moyer wrote:
> 
> By closing the file descriptor before calling io_destroy, you pretty
> much guarantee that the last put on the ioctx will be done in interrupt
> context (during I/O completion).  This behavior has unearthed bugs in
> the kernel in several different kernel versions, so let's add a test to
> poke at it.
> 
> The original test case was provided by Matt Cross.  He has graciously
> relicensed it under the GPL v2 or later so that it can be included in
> xfstests.  I've modified the test a bit so that it would generate a
> stable output format and to run for a fixed amount of time.
> 
> Signed-off-by: Jeff Moyer <jmoyer@xxxxxxxxxx>

Jeff, this test is causing xfstests to fail unmounts with EBUSY
frequently on some of my test VMs (i.e. in >60% of my test runs in
the past week).

$ sudo MKFS_OPTIONS="-m crc=1,finobt=1" ./check generic/323
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 test2 3.16.0-dgc+
MKFS_OPTIONS  -- -f -m crc=1,finobt=1 /dev/vdb
MOUNT_OPTIONS -- /dev/vdb /mnt/scratch

generic/323 121s ... 121s
umount: /mnt/test: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
_check_xfs_filesystem: filesystem on /dev/vda has dirty log (see /home/dave/src/xfstests-dev/results//generic/323.full)
_check_xfs_filesystem: filesystem on /dev/vda is inconsistent (c) (see /home/dave/src/xfstests-dev/results//generic/323.full)
_check_xfs_filesystem: filesystem on /dev/vda is inconsistent (r) (see /home/dave/src/xfstests-dev/results//generic/323.full)
Ran: generic/323
Passed all 1 tests
$ sudo umount /mnt/test
$

i.e. something that the test is doing is leaving the superblock
referenced after all the processes have finished and exited, but a
manual unmount immediately after the failed run works just fine. So
the situation only persists for a couple of seconds. Adding a "sleep 5"
to the test just before it exits also makes the failure go away.
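
To illustrate how transient the EBUSY is, here is a rough sketch of a
bounded retry, as opposed to the unconditional "sleep 5". The helper
name retry_for is hypothetical and is not an xfstests function; it only
papers over the symptom and does nothing about the underlying
reference leak.

```shell
# Hypothetical helper, not part of xfstests: run a command, retrying
# once per second until it succeeds or the timeout (seconds) expires.
retry_for() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1    # still failing after the timeout
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Assumed usage: give outstanding AIO completions up to 5 seconds to
# drop their superblock references before reporting the unmount failure.
# retry_for 5 umount /mnt/test
```

If the unmount succeeds on the second or third attempt, that is
consistent with the reference being dropped shortly after the test
processes exit rather than being leaked permanently.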

I have only ever seen this same issue with generic/208 - it's been
doing this randomly for as long as I can remember. That test is also
an aio+dio test and adding the same "sleep 5" makes that test no
longer show the issue.

IOWs, we now have two AIO+DIO tests showing the same symptoms that
no other tests show. This tends to point at AIO not being fully
cleaned up and completely freed by the time the processes
dispatching it have exit()d. This failure generally occurs when
there is other load on the system/disks backing the test VM (e.g.
running xfstests in multiple VMs at the same time) so I suspect it
has to do with IO completion taking a long time.

Can you spend some time trying to reproduce this and getting to the
bottom of whatever is triggering the unmount error?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx