Re: How to do strict synchronous i/o on Windows?

On Wednesday, 15 August 2012, Greg Sullivan wrote:
> On 15/08/2012, Martin Steigerwald <Martin@xxxxxxxxxxxx> wrote:
> > On Wednesday, 15 August 2012, Greg Sullivan wrote:
> >> On 15 August 2012 07:24, Martin Steigerwald <Martin@xxxxxxxxxxxx> wrote:
> >> > On Tuesday, 14 August 2012, Greg Sullivan wrote:
> >> > > On 15/08/2012, Martin Steigerwald <Martin@xxxxxxxxxxxx> wrote:
> >> > > > On Tuesday, 14 August 2012, Greg Sullivan wrote:
> >> > > >> On 15/08/2012, Martin Steigerwald <Martin@xxxxxxxxxxxx> wrote:
> >> > > >> > On Tuesday, 14 August 2012, Greg Sullivan wrote:
> >> > > >> >> On 15 August 2012 03:36, Martin Steigerwald
> >> > > >> >> <Martin@xxxxxxxxxxxx>
> >> > > >> >> 
> >> > > >> >> wrote:
> >> > > >> >> > Hi Greg,
> >> > > >> > 
> >> > > >> > […]
> >> > > >> > 
> >> > > >> >> > On Tuesday, 14 August 2012, Greg Sullivan wrote:
> >> > > >> >> >> On Aug 14, 2012 11:06 PM, "Jens Axboe"
> >> > > >> >> >> <axboe@xxxxxxxxx>
> > 
> > wrote:
> >> > > >> >> >> > On 08/14/2012 08:24 AM, Greg Sullivan wrote:
> >> > […]
> >> > 
> >> > > >> >> Is it possible to read from more than one file in a single
> >> > > >> >> job, in a round-robin fashion? I tried putting more than
> >> > > >> >> one file in a single job, but it only opened one file. If
> >> > > >> >> you mean to just do random reads in a single file - I've
> >> > > >> >> tried that, and the throughput is unrealistically low. I
> >> > > >> >> suspect it's because the read-ahead buffer cannot be
> >> > > >> >> effective for random accesses.  Of course, reading
> >> > > >> >> sequentially from a single file will result in a
> >> > > >> >> throughput that is far too high to simulate the
> >> > > >> >> application.
> >> > > >> > 
> >> > > >> > Have you tried
> >> > > >> > 
> >> > > >> >        nrfiles=int
> >> > > >> >        
> >> > > >> >               Number of files to use for this job. 
> >> > > >> >               Default: 1.
> >> > > >> >        
> >> > > >> >        openfiles=int
> >> > > >> >        
> >> > > >> >               Number of files to keep open at the same
> >> > > >> >               time. Default: nrfiles.
> >> > > >> >        
> >> > > >> >        file_service_type=str
> >> > > > 
> >> > > > […]
> >> > > > 
> >> > > >> > ? (see fio manpage).
> >> > > >> > 
> >> > > >> > It seems to me that all you need is nrfiles. I'd bet that
> >> > > >> > fio distributes
> >> > > >> > the I/O size given among those files, but AFAIR there is
> >> > > >> > something about
> >> > > >> > that in fio documentation as well.
> >> > > >> > 
> >> > > >> > Use the doc! ;)
> >> > > > 
> >> > > > […]
> >> > > > 
> >> > > >> Yes, I have tried all that, and it works, except that it
> >> > > >> causes disk queuing, as I stated in my first post. I thought
> >> > > >> you meant to put all the files into a single [job name]
> >> > > >> section of the INI file, to enforce single-threaded I/O.
> >> > > > 
> >> > > > With just one job running at once?
> >> > > > 
> >> > > > Can you post an example job file?
> >> > > > 
> >> > > > Did you try the sync=1 / direct=1 suggestion from Bruce Chan?
> >> > > > 
> >> > > > I only know the behaviour of fio on Linux, where an I/O depth
> >> > > > greater than one is only possible with libaio and direct=1.
> >> > > > The manpage hints that the I/O depth is one for all synchronous
> >> > > > I/O engines, so I'd bet that applies to Windows as well.
> >> > > > 
> >> > > > Other than that I have no idea.
> >> > 
> >> > […]
> >> > 
> >> > > One INI file, but a separate [job name] section for each file,
> >> > > yes. According to Jens, because each [job name] is a separate
> >> > > thread, and iodepth acts at the thread level, there will still
> >> > > be queuing at the device level. If there were a way to do what
> >> > > I want I think Jens would have told me, unfortunately.   ;)
> >> > > 
> >> > > direct io does at least allow me to do cache-less reads though -
> >> > > thank you.
> >> > 
> >> > My suggestion is to use one job with several files.
> >> > 
> >> > martin@merkaba:/tmp> cat severalfiles.job
> >> > [global]
> >> > size=1G
> >> > nrfiles=100
> >> > 
> >> > [read]
> >> > rw=read
> >> > 
> >> > [write]
> >> > stonewall
> >> > rw=write
> >> > 
> >> > (these are two jobs, but stonewall makes the write job run only
> >> > after the read job has finished, with cache invalidation in between
> >> > unless disabled, and if supported by the OS)
> >> > 
> >> > martin@merkaba:/tmp> fio severalfiles.job
> >> > read: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
> >> > write: (g=1): rw=write, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
> >> > 2.0.8
> >> > Starting 2 processes
> >> > read: Laying out IO file(s) (100 file(s) / 1023MB)
> >> > write: Laying out IO file(s) (100 file(s) / 1023MB)
> >> > Jobs: 1 (f=100)
> >> > read: (groupid=0, jobs=1): err= 0: pid=23377
> >> > [… very fast results due to /tmp being a RAM-based filesystem –
> >> > tmpfs …]
> >> > 
> >> > 
> >> > martin@merkaba:/tmp> ls -lh read.1.* | head
> >> > -rw-r--r-- 1 martin martin 11M Aug 14 23:15 read.1.0
> >> > -rw-r--r-- 1 martin martin 11M Aug 14 23:15 read.1.1
> > 
> > […]
> > 
> >> > [… only first ten displayed …]
> >> > 
> >> > martin@merkaba:/tmp> find -name "read.1*" 2>/dev/null | wc -l
> >> > 100
> >> > 
> >> > 100 files at 11M each – due to rounding, that may nicely add up
> >> > to the one GiB.
> >> > 
> >> > Raw sizes are:
> >> > 
> >> > martin@merkaba:/tmp> ls -l read.1.* | head
> >> > -rw-r--r-- 1 martin martin 10737418 Aug 14 23:20 read.1.0
> >> > -rw-r--r-- 1 martin martin 10737418 Aug 14 23:20 read.1.1
> > 
> > […]
> > 
> >> > Note: When I used filename, fio just created one file regardless
> >> > of the nrfiles setting. I would have expected it to use the
> >> > filename as a prefix. There might be some way to have it do that.
> >> > 
> >> > Ciao,
> >> 
> >> Thanks - that runs, but it's still queuing. As I said before, I
> >> can't use the sync engine - I receive an error. Is there a
> >> synchronous engine available for Windows? Perhaps that's the only
> >> problem. Can you check to see whether your system is queuing at the
> >> file system/device level when you run that test?
> >> 
> >> I had attempted to put the files in a single job earlier - I think
> >> it may have been successfully accessing both files, but I didn't
> >> notice it in the output. I'm a raw beginner.
> > 
> > Did you try with
> > 
> > ioengine=windowsaio
> > 
> > +
> > 
> > iodepth=1 (which should be the default anyway, I think)
> > 
> > 
> > Otherwise I have no idea. I never used fio on Windows so far.
> > 
> > It might help if you explained exactly which problem you want to
> > solve with the fio measurements. Multimedia streaming – is it too
> > slow? Why exactly do you want to do these measurements?
> 
> They are both defaults, and the output shows that both are being used.
> If you could tell me whether your system is generating queuing it
> would help, because if yours queues even when using the sync I/O
> engine, it means I'm wasting my time and fio simply needs to be
> augmented to support strict single-threaded operation over multiple
> files.
> 
> I want to determine whether the application in question can
> extract a reasonable number of real-time streams from any given
> storage system.

Just for the record, since you got it working on Windows as well – it works for me, too:

merkaba:/tmp> cat severalfiles.job 
[global]
size=1G
nrfiles=100

[read]
rw=read

merkaba:/tmp> fio severalfiles.job
read: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
2.0.8
Starting 1 process

read: (groupid=0, jobs=1): err= 0: pid=4579
  read : io=1023.9MB, bw=2409.7MB/s, iops=616705 , runt=   425msec
    clat (usec): min=0 , max=54 , avg= 1.08, stdev= 0.64
     lat (usec): min=0 , max=54 , avg= 1.13, stdev= 0.66
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[    1], 10.00th=[    1], 20.00th=[    1],
     | 30.00th=[    1], 40.00th=[    1], 50.00th=[    1], 60.00th=[    1],
     | 70.00th=[    1], 80.00th=[    1], 90.00th=[    1], 95.00th=[    2],
     | 99.00th=[    2], 99.50th=[    2], 99.90th=[   14], 99.95th=[   16],
     | 99.99th=[   23]
    lat (usec) : 2=92.14%, 4=7.74%, 10=0.01%, 20=0.10%, 50=0.01%
    lat (usec) : 100=0.01%
  cpu          : usr=22.41%, sys=76.42%, ctx=421, majf=0, minf=36
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262100/w=0/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=1023.9MB, aggrb=2409.7MB/s, minb=2409.7MB/s, maxb=2409.7MB/s, mint=425msec, maxt=425msec

(if you wonder about the figures – that's testing against RAM – Linux tmpfs ;)

100% at IO depth 1.
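
For completeness, here is what I would try on Windows – untested from my
side, since I only ever ran fio on Linux, so treat it as a sketch: the
same job file, but with the Windows engine from the fio documentation and
the options discussed earlier in this thread (iodepth=1, direct=1):

[global]
size=1G
nrfiles=100
; windowsaio is fio's native Windows I/O engine
ioengine=windowsaio
; allow at most one I/O in flight from this job (the default anyway)
iodepth=1
; bypass the OS buffer cache, so read-ahead cannot hide the queuing
direct=1

[read]
rw=read

With iodepth=1 fio should keep at most one I/O outstanding from this
single job across all 100 files, so any queuing you still see at the
device level would have to come from outside fio – file system
read-ahead or the device itself.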

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

