Re: unexpected results from simple test?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 22 2009, Gurudas Pai wrote:
>
> Richard Kennedy wrote:
>> I been running a hopefully simple fio test, but I don't understand the
>> results.
>>
>> the test :-
>>
>> fio --name=f1 --directory=/sda --rw=rw --size=400m
>> 	--name=f2 --directory=/sdb --rw=write --size=1g --startdelay=10
>>
>> on linux 2.6.30-git
>>
>> running this takes quite while, 'time fio ...' gives
>> real	3m49.174s
>> user	0m0.838s
>> sys	0m7.990s
>>
>> (edited highlights of the fio output)
>> f1: (groupid=0, jobs=1): err= 0: pid=5181
>>   read : io=200MiB, bw=23,480KiB/s, iops=5,870, runt=  8722msec
>>   write: io=200MiB, bw=23,499KiB/s, iops=5,874, runt=  8715msec
>> f2: (groupid=0, jobs=1): err= 0: pid=5182
>>   write: io=1,024MiB, bw=77,231KiB/s, iops=19,307, runt= 13577msec
>>
>> The jobs seem to finish reasonable quickly, but then fio seem to be
>> stuck waiting for something. CPU usage is 100% but the CPU frequency
>> stays at the lowest level and the system is really sluggish in its
>> response. It does recover in time but this test does seem to torture my
>> system.
>>
>> strace shows fio repeating this :-
>> open and read several disk stats, such as 
>>
>> open("/sys/block/sdb/stat", O_RDONLY)   = 10
>> fstat(10, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff9ef70c000
>> read(10, "   45168    22897  2239004   1505"..., 4096) = 99
>> close(10)                               = 0
>> munmap(0x7ff9ef70c000, 4096)            = 0
>>
>> then...
>>
>> gettimeofday({1245665343, 22984}, NULL) = 0
>> gettimeofday({1245665343, 23015}, NULL) = 0
>> ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
>> gettimeofday({1245665343, 23086}, NULL) = 0
>> gettimeofday({1245665343, 23115}, NULL) = 0
>> gettimeofday({1245665343, 23150}, NULL) = 0
>> setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
>> rt_sigreturn(0)                         = -1 EINTR (Interrupted system call)
>> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
>> nanosleep({0, 10000000}, NULL)          = 0
>> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
>> nanosleep({0, 10000000}, NULL)          = 0
>> ...
>> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
>> nanosleep({0, 10000000}, NULL)          = 0
>>
>> then back to the open & reads..
>>
>> The machine is a AMD X2 64bit with 3G memory & fio is 1.26.
>>
>> Any ideas what's going on? or suggestions to help track it down?
>
> Your test is doing buffered IO, so around ~1.4 GB of memory is consumed  
> by io units, so machine is sluggish. And when test is about to finish,  
> fio does fsync, so all data is sync'd up, and it is taking little long  
> time. and the main thread/process is waiting for fsync to complete.  
> Please run the same test with direct IO .
> Add "direct=1" in your jobfile.

It wont fsync by default, so I don't think that is it. But if we get an
overview of where the other fio processes are stuck, it'll be a bit
easier to see what the problem is here :-)

If you are in fsync, then fio will also list an 'F' for that job in the
status line. So that should be easy to tell.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux