Re: [whacky issue] xfs/277 endlessly looping in _require_xfs_io_command


On Wed, Oct 11, 2017 at 07:13:30PM +1100, Dave Chinner wrote:
> On Wed, Oct 11, 2017 at 07:00:47PM +1100, Dave Chinner wrote:
> > Hi folks,
> > 
> > I was wondering if anyone else had seen this problem, because it's
> > got me absolutely stumped. One of my test VMs is having a really
> > weird livelock in xfs/277. It's getting stuck in an endless loop
> > burning the entire CPU in the 277 process (i.e. running bash).
> > What it is stuck on makes no sense to me, nor does the looping
> > behaviour, and I can only reproduce it on this one machine.
> 
> FWIW, the trigger for this was enabling rmap+reflink on the test
> device on this machine for the first time. Hence this is the first
> time the test has been run on that machine...

TL;DR: this is the result of bugs in the xfstests code. An unquoted
"echo $var" results in the contents of $var being treated as glob
patterns and matched against files in the current working directory.


Longer: the fsmap output contains text like [1234..5678] on every
line, and there were 5,000 lines of output being generated with about
100k of characters inside [] brackets. If you run 'echo [1234]', it
does not simply output '[1234]' to the console - the shell treats
[1234] as a glob and goes searching the current working directory for
matches. That's where the readdir output comes from.

So:

$ echo foo [1234..5768] bar
foo 1 bar
$ ls -l |grep " 1$"
-rw-r--r-- 1 dave dave         35 Apr 14  2010 1
$

The "1" is output because it matches a file in the local directory.
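To make the matching rule concrete, here is a small sketch that can be run anywhere. It assumes bash with default options (no nullglob or failglob), and the scratch directory and file names are made up for the demonstration. The bracket expression matches exactly one character from the set, so only a file named "1" matches, and with no matching file the word is passed through unchanged.

```shell
#!/bin/bash
# Hypothetical scratch directory; nothing here comes from xfstests itself.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# No files yet: the pattern matches nothing, so the shell leaves it as-is.
empty_result=$(echo foo [1234..5768] bar)

# "1" is one character from the set, "12" is two characters long,
# and "x" is not in the set, so only "1" can match.
touch 1 12 x
match_result=$(echo foo [1234..5768] bar)

echo "empty dir:  $empty_result"
echo "with files: $match_result"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

This should print "foo [1234..5768] bar" for the empty directory and "foo 1 bar" once the files exist, which is why the same command can look harmless in one directory and not in another.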

And if I strace the echo command, I see:

open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=28672, ...}) = 0
getdents(3, /* 1109 entries */, 32768)  = 31656
getdents(3, /* 0 entries */, 32768)     = 0
close(3) = 0
write(1, "foo 1 bar\n", 10)             = 10

So, that's where the readdir comes from. It doesn't explain the
slowness, or why it doesn't reproduce in an interactive shell or on
any other machine.


So, look at what directory it is running in ($here, the root of the
xfstests installation) and now add other xfstests bugs that have
resulted in ~6500 test files being dumped in $here.

(generic/109 is one of the culprits)

Now we have ~100k characters being matched against ~6500 filenames.
There's the slowness.  I simply wasn't patient enough to wait for
it.
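A scaled-down sketch of that cost, for anyone who wants to see it on their own machine. The sizes, file names and the shape of the bracketed words are all assumptions chosen to keep the run short; they are not taken from the real test output. It assumes bash (for the "time" keyword).

```shell
#!/bin/bash
# Hypothetical scratch directory and sizes, much smaller than the real case.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
nfiles=500
nwords=200

# Populate the directory with files that never match the patterns below.
i=0
while [ "$i" -lt "$nfiles" ]; do
    : > "file.$i"
    i=$((i + 1))
done

# Build one long string full of bracketed words, e.g. "[0..7] [1..8] ...".
var=""
i=0
while [ "$i" -lt "$nwords" ]; do
    var="$var [$i..$((i + 7))]"
    i=$((i + 1))
done

# Unquoted: every bracketed word is a glob checked against the directory.
time echo $var > /dev/null
# Quoted: one literal word, no directory scan at all.
time echo "$var" > /dev/null

cd / && rm -rf "$demo_dir"
```

Raising nfiles and nwords towards the numbers above should make the gap between the two timings grow accordingly.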

One `git clean -f -d` later, and the test now runs and completes in
4s.

So, what's the bug in xfstests? The usual bug in shell programs:
missing variable quoting.

$ echo foo [1234..5768] bar 
foo 1 bar
$ echo "foo [1234..5768] bar"
foo [1234..5768] bar
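The same thing applies when the text lives in a variable, which is the "echo $var" case from the TL;DR. A minimal sketch, assuming bash with default options; the line content and the scratch directory are made up for illustration and are not real fsmap output:

```shell
#!/bin/bash
# Hypothetical scratch directory containing one file named "1".
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 1

# Made-up line for illustration, not real fsmap output.
line='extent [1234..5678] data'

unquoted=$(echo $line)            # word-split, then "[1234..5678]" is globbed
quoted=$(echo "$line")            # passed through as one literal word
safest=$(printf '%s\n' "$line")   # also avoids echo's option/escape handling

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "printf:   $safest"

cd / && rm -rf "$demo_dir"
```

Only the unquoted form should come out as "extent 1 data"; the other two keep the brackets intact.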

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx