On 2014-07-25 09:43, Jens Axboe wrote:
On 2014-07-21 22:25, Vasily Tarasov wrote:
Hi Jens,
I tried your patch, but it didn't help. Interestingly, the number of
threads changes as the run goes on. At first, during the run:
# ps -eLf | grep fio
root 5224 4274 5224 1 2 11:12 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5225 0 2 11:12 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5231 5224 5231 60 1 11:12 ? 00:00:07 fio --status-interval 10 --minimal fios/1.fio
root 5260 5237 5260 0 1 11:12 pts/0 00:00:00 grep fio
[root@bison01 vass]# ps -eLf | grep fio
root 5224 4274 5224 0 2 11:12 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5225 0 2 11:12 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5231 5224 5231 16 1 11:12 ? 00:00:21 fio --status-interval 10 --minimal fios/1.fio
root 5293 5237 5293 0 1 11:14 pts/0 00:00:00 grep fio
[root@bison01 vass]# ps -eLf | grep fio
root 5224 4274 5224 0 2 11:12 pts/1 00:00:01 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5225 0 2 11:12 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5231 5224 5231 12 1 11:12 ? 00:01:13 fio --status-interval 10 --minimal fios/1.fio
root 5411 5237 5411 0 1 11:22 pts/0 00:00:00 grep fio
Later, when the threads are stuck:
# ps -eLf | grep fio
root 5224 4274 5224 0 16 11:12 pts/1 00:00:02 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5225 0 16 11:12 pts/1 00:00:01 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5458 0 16 11:25 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5459 0 16 11:25 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5460 0 16 11:25 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5461 0 16 11:25 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5462 0 16 11:25 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5471 0 16 11:25 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5472 0 16 11:26 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5475 0 16 11:26 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5476 0 16 11:26 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5477 0 16 11:26 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5478 0 16 11:26 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5487 0 16 11:26 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5488 0 16 11:27 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 5224 4274 5489 0 16 11:27 pts/1 00:00:00 fio --status-interval 10 --minimal fios/1.fio
root 6665 5237 6665 0 1 13:21 pts/0 00:00:00 grep fio
Is the number of threads supposed to change?
I never answered this one... Yes, it'll change: when you run the job,
you'll typically have one backend process, a number of IO workers, and
one disk util thread. When you get stuck, it's the backend that is
left waiting for that mutex.
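If it helps to picture it, here is a minimal, standalone pthreads sketch
(not fio code; real fio IO workers are usually separate processes, as in
your ps output) of a backend with a helper thread and a few worker
threads, so the NLWP column in ps -eLf grows while the workers are alive
and shrinks again once they exit. Compile with gcc -pthread:

/* Standalone illustration, not fio source: one backend process that
 * starts a helper thread plus a few worker threads, so the NLWP
 * column reported by `ps -eLf` changes over the life of the run. */
#include <pthread.h>
#include <unistd.h>

static void *io_worker(void *arg)
{
	sleep(10);		/* stand-in for the actual IO loop */
	return NULL;
}

static void *util_helper(void *arg)
{
	sleep(10);		/* stand-in for periodic disk util sampling */
	return NULL;
}

int main(void)
{
	pthread_t workers[4], helper;
	int i;

	pthread_create(&helper, NULL, util_helper, NULL);
	for (i = 0; i < 4; i++)
		pthread_create(&workers[i], NULL, io_worker, NULL);

	/* While the workers run, ps -eLf shows NLWP == 6 for this PID
	 * (main + helper + 4 workers); after the joins it is 1 again. */
	for (i = 0; i < 4; i++)
		pthread_join(workers[i], NULL);
	pthread_join(helper, NULL);

	sleep(30);		/* leave time to inspect the shrunk count */
	return 0;
}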
In any case, I haven't been able to figure this one out yet. But it
should be safe enough to just ignore the stat mutex for the final
output, since the threads otherwise accessing it are gone. Can you see
if this one makes the issue go away?
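For illustration only, here is a rough standalone sketch of that idea in
plain pthreads. The names show_run_stats()/__show_run_stats() are
borrowed from the discussion, but none of this is the actual fio patch:
the periodic output path takes the stat mutex while workers are alive,
and the final output path reads the stats directly once the workers have
been reaped, so a wedged mutex can no longer block the backend at exit.

/* Standalone sketch of the "skip the stat mutex for the final output"
 * idea, using plain pthreads rather than fio's own primitives. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t stat_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long total_ios;

/* Workers update the shared stats under the mutex while they run. */
static void *io_worker(void *arg)
{
	int i;

	for (i = 0; i < 1000; i++) {
		pthread_mutex_lock(&stat_mutex);
		total_ios++;
		pthread_mutex_unlock(&stat_mutex);
	}
	return NULL;
}

/* Periodic output still takes the mutex, since workers are live. */
static void show_run_stats(void)
{
	pthread_mutex_lock(&stat_mutex);
	printf("ios so far: %lu\n", total_ios);
	pthread_mutex_unlock(&stat_mutex);
}

/* Final output skips the mutex: the workers have been joined, so
 * nothing else touches the stats, and a stuck mutex can no longer
 * wedge the backend at exit. */
static void __show_run_stats(void)
{
	printf("final ios: %lu\n", total_ios);
}

int main(void)
{
	pthread_t workers[4];
	int i;

	for (i = 0; i < 4; i++)
		pthread_create(&workers[i], NULL, io_worker, NULL);

	show_run_stats();

	for (i = 0; i < 4; i++)
		pthread_join(workers[i], NULL);

	__show_run_stats();
	return 0;
}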
The patch did not compile; it was missing the non-static __show_run_stats().
But just pull current -git, I have committed a variant that does compile :-)
--
Jens Axboe