On Fri, Oct 30, 2009 at 9:20 AM, Chris Worley <worleys@xxxxxxxxx> wrote:
> I'm running fio benchmarks for 120 hours:
>
> fio --rw=randrw --bs=4k --rwmixread=60 --numjobs=64 --iodepth=64
> --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio
> --filename=/dev/sde --filename=/dev/sdf --name=test --loops=10000
> --size=322122547200 --runtime=432000 --group_reporting
>
> The fio threads are generally reporting ~100K IOPS, and each of the 64
> fio threads uses less than 10% CPU.
>
> But every 13 hours (nearly to the minute), for ~200 seconds, the fio
> threads start consuming large amounts of CPU and the IOPS drop to
> ~40K:

Josh is brilliant:

(322122547200 [--size, bytes per job] * 64 [--numjobs]) / (4096 [--bs] * 100000 [IOPS] * 3600 [s/hour]) ~= 14 hours

...it's restarting the next loop and recalculating the random maps, which
lines up with the ~13-hour period we're seeing.

Chris

> top - 07:36:59 up 8 days, 8:12, 0 users, load average: 57.42, 58.54, 58.00
> Tasks: 352 total, 17 running, 335 sleeping, 0 stopped, 0 zombie
> Cpu(s): 1.2%us, 14.8%sy, 0.0%ni, 72.4%id, 10.1%wa, 1.3%hi, 0.2%si, 0.0%st
> Mem: 49449752k total, 6643996k used, 42805756k free, 205832k buffers
> Swap: 0k total, 0k used, 0k free, 1432664k cached
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 22108 root  18   0 86144  10m 9948 R 80.6  0.0 137:55.21 fio
> 22074 root  18   0 86144  10m 9952 R 70.8  0.0 138:07.68 fio
> 22101 root  18   0 86144  10m 9952 R 61.0  0.0 138:18.43 fio
> 22095 root  16   0 86144  10m 9948 D 51.1  0.0 137:54.02 fio
> 22134 root  16   0 86144  10m 9948 R 51.1  0.0 138:03.73 fio
> 22068 root  16   0 86144  10m 9952 R 49.2  0.0 138:04.16 fio
> 22083 root  16   0 86144  10m 9952 D 45.2  0.0 138:09.32 fio
> 22071 root  16   0 86144  10m 9948 D 29.5  0.0 137:37.75 fio
> 22128 root  16   0 86144  10m 9952 D 25.6  0.0 137:48.11 fio
> 22092 root  16   0 86144  10m 9948 D 11.8  0.0 137:54.14 fio
> 22097 root  16   0 86144  10m 9948 D  9.8  0.0 138:07.81 fio
> ...
>
> There is nothing in the system to account for this (it does not coincide
> with, e.g., logrotate or cron jobs).
>
> I've repeated this with fio 1.21 and 1.31, on RHEL 5.3 and RHEL 5.4. In
> the 1.31/RHEL 5.4 case the period was ~15 hours... but that was also a
> slower machine.
>
> Any idea what's going on?
>
> Thanks,
>
> Chris
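
A quick sanity check of the arithmetic above (this assumes bc is installed
and the flat ~100K aggregate IOPS reported in the original post):

    $ echo "scale=2; 322122547200 * 64 / (4096 * 100000 * 3600)" | bc
    13.98

...so just under 14 hours at exactly 100K IOPS; with the actual rate running
a little above 100K, that lands right on the observed ~13-hour period.

If the stall really is the per-loop random map rebuild, one way to confirm
(and avoid) it is to drop the block map entirely with --norandommap. This is
only a sketch: with the map disabled fio no longer guarantees each block is
hit exactly once per pass, so coverage becomes approximate. All other options
are copied unchanged from the command line above:

    fio --rw=randrw --bs=4k --rwmixread=60 --numjobs=64 --iodepth=64 \
        --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio \
        --filename=/dev/sde --filename=/dev/sdf --name=test --loops=10000 \
        --size=322122547200 --runtime=432000 --group_reporting \
        --norandommap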