On 2014-07-25 18:34, Vasily Tarasov wrote:
Hi Jens,

You'll be surprised, but it did not help :( I used the latest code from git (fio-2.1.11-10-gae7e, commit ae7e050). Still see the same picture.
That's actually good news, since it didn't make a lot of sense. So let's see if we can't get to the bottom of this...
I don't know if it helps, but I see this behavior on a machine with 96GB of RAM. So, after buffered writes are over, fio waits for a long time till all dirty buffers hit the disk. But even after there is no more disk activity, fio stays stuck until I kill it.

Regarding the number of threads: I do understand where 3 threads can come from:

1) Backend thread (sort of a manager)
2) Worker thread(s)
3) Disk stats thread

In my case I defined only one job instance, so I suppose there should always be only one worker thread. I don't understand how the total number of threads goes to 10 in the end.

<snip starts>
$ ps -eLf | grep fio
root 4427 4135 4427 0 15 07:44 pts/1 00:00:02 fio --minimal --status-interval 10 1.fio
root 4427 4135 4636 0 15 07:56 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4637 0 15 07:57 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4638 0 15 07:57 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4647 0 15 07:57 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4650 0 15 07:57 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4651 0 15 07:57 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4652 0 15 07:57 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4653 0 15 07:58 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4654 0 15 07:58 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4663 0 15 07:58 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4664 0 15 07:58 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4666 0 15 07:58 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4668 0 15 07:58 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
root 4427 4135 4669 0 15 07:59 pts/1 00:00:00 fio --minimal --status-interval 10 1.fio
<snip ends>
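For double-checking a thread count from a `ps -eLf` listing like the one quoted here, something along these lines might help. It is only a sketch: the helper name `count_lwps` is made up for illustration, and it assumes the usual `ps -eLf` column order, with the PID in the second field and one row per thread.

```shell
# Hypothetical helper: count the threads (LWP rows) that belong to one PID.
# Reads `ps -eLf`-style output on stdin; $1 is the PID to match.
# Assumes the PID is the second whitespace-separated column.
count_lwps() {
    awk -v pid="$1" '$2 == pid { n++ } END { print n + 0 }'
}

# Possible usage (4427 is the fio PID from the listing):
#   ps -eLf | count_lwps 4427
```

Matching on the PID column rather than grepping for "fio" keeps the grep process itself and any unrelated command lines out of the count.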
Can you try to attach gdb to it when it's hung and produce a new backtrace? It can't be off the final status run; I wonder if it's off the mutex down and remove instead.
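In case it saves some typing, one way to grab backtraces for all threads non-interactively could look like the sketch below. The function name `fio_backtrace` is just an illustration, and the PID has to be the hung fio process on your system. It typically needs to run as root (or the same user as fio), and a fio build with debug symbols gives far more readable frames.

```shell
# Hypothetical helper: attach gdb to a running process, dump a backtrace
# for every thread, then detach and exit.
# $1 is the PID of the hung fio process.
fio_backtrace() {
    # -p attaches to the PID, -batch exits after the commands have run,
    # -ex runs the given gdb command.
    gdb -p "$1" -batch -ex "thread apply all bt"
}

# Possible usage (4427 is the PID from the ps listing earlier in this thread):
#   fio_backtrace 4427 > fio-backtrace.txt 2>&1
```

The process is stopped only while gdb is attached; it resumes once gdb detaches on exit.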
--
Jens Axboe