Re: Exit all jobs on error

On 12/11/2015 01:32 PM, Andrey Kuzmin wrote:

On Dec 11, 2015 22:59, "Sitsofe Wheeler" <sitsofe@xxxxxxxxx> wrote:
 >
 > On 11 December 2015 at 15:32, Jens Axboe <axboe@xxxxxxxxx> wrote:
 > > On 12/11/2015 03:01 AM, Andrey Kuzmin wrote:
 > >>
 > >> ^Cbs: 1 (f=1): [w(1)] [0.0% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 01d:12h:24m:29s]
 > >> Program received signal SIGINT, Interrupt.
 > >> 0x00007ffff6b7ff3d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
 > >> 81 ../sysdeps/unix/syscall-template.S: No such file or directory.
 > >> (gdb) bt
 > >> #0  0x00007ffff6b7ff3d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
 > >> #1  0x00007ffff6bb14a4 in usleep (useconds=<optimized out>) at ../sysdeps/unix/sysv/linux/usleep.c:32
 > >> #2  0x000000000045a7ed in do_usleep (usecs=10000) at backend.c:1951
 > >> #3  0x000000000045b33c in run_threads () at backend.c:2216
 > >> #4  0x000000000045b6a8 in fio_backend () at backend.c:2333
 > >> #5  0x00000000004991cb in main (argc=4, argv=0x7fffffffdda8, envp=0x7fffffffddd0) at fio.c:60
 > >
 > >
 > > That's not one of the IO threads, that's the main thread. It'll sit and wait
 > > in that loop until jobs finish. You'll need the backtrace of one of the
 > > stuck IO threads instead; this trace is quite normal and expected of the backend.
 > >
 > > --
 > > Jens Axboe
 > >
 >
 > Andrey:
 >
 > Could you try
 > thread apply all bt full
 > (found over on https://wiki.gentoo.org/wiki/Project:Quality_Assurance/Backtraces)?
 >

That test case is already gone, but - if interested - you can easily
simulate it by randomly dropping an io_u inside the engine.

I'm following up on this here, since parts of that thread apparently ended up off the mailing list.

If you drop an io_u inside the engine, then fio will of course get stuck waiting for completions. That would be an IO engine bug. Fio does not track timeouts internally, because it does not have to:

For the more realistic case of being stuck waiting for IO that has been submitted to the kernel, we strictly depend on the kernel completing those IOs. If it does not, that's a kernel bug, and it won't matter whether we explicitly wait for the IO, since the wait happens anyway when we drop the aio context. Either the IO gets completed by the device, or a driver timeout takes care of completing it. In either case, we get a completion event.

There's no fio bug here.

--
Jens Axboe
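
The dropped io_u case discussed above can be illustrated with a toy model. This is only a sketch in Python and not fio code: ToyEngine, run_job, drop_every and max_polls are hypothetical names made up for the illustration, and the poll limit exists only so that the demo terminates. As noted above, fio itself tracks no timeout and would simply keep waiting.

```python
from collections import deque


class ToyEngine:
    """Hypothetical stand-in for an IO engine; not fio's real ioengine API."""

    def __init__(self, drop_every=0):
        self.drop_every = drop_every  # 0 means the engine never drops anything
        self.pending = deque()
        self.queued = 0

    def queue(self, io_u):
        self.queued += 1
        if self.drop_every and self.queued % self.drop_every == 0:
            return  # engine bug: the unit is accepted but never completed
        self.pending.append(io_u)

    def getevents(self):
        # Hand back everything that has completed since the last call.
        events = list(self.pending)
        self.pending.clear()
        return events


def run_job(engine, nr_ios, max_polls=1000):
    """Submit nr_ios units, then reap completions until none are in flight.

    Returns the set of units that never completed. max_polls is an
    assumption of this sketch so that the loop ends; the behaviour
    described in the mail corresponds to waiting without a limit.
    """
    in_flight = set()
    for io_u in range(nr_ios):
        engine.queue(io_u)
        in_flight.add(io_u)

    polls = 0
    while in_flight and polls < max_polls:
        for io_u in engine.getevents():
            in_flight.discard(io_u)
        polls += 1
    return in_flight


if __name__ == "__main__":
    print("healthy engine, stuck units:", run_job(ToyEngine(), 8))
    print("buggy engine, stuck units:", run_job(ToyEngine(drop_every=4), 8))
```

With a well-behaved engine the in-flight set drains and the job finishes. With an engine that drops units, the set never empties, which is the state in which the main thread keeps sleeping in its wait loop while an IO thread waits for completions that will not arrive.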
