Thanks a lot. Everything you said made complete sense to me, but when I tried running with the following options my read is very slow (with direct I/O at 1 MB/s it would take nearly 18 minutes just to read the full 1 GB), yet my write is doing fine. Should I use some other options of dd? I understand that with direct we bypass all the caches, but direct doesn't guarantee that everything is written when the call returns to user space, which is why I am also using fdatasync.
time dd if=/dev/shm/image of=/dev/sbd0 bs=4096 count=262144 oflag=direct conv=fdatasync
262144+0 records in
262144+0 records out
1073741824 bytes (1.1 GB) copied, 17.7809 s, 60.4 MB/s
real 0m17.785s
user 0m0.152s
sys 0m1.564s
I interrupted the read dd because it was taking too long at 1 MB/s:
time dd if=/dev/pdev0 of=/dev/null bs=4096 count=262144 iflag=direct conv=fdatasync
^C150046+0 records in
150045+0 records out
614584320 bytes (615 MB) copied, 600.197 s, 1.0 MB/s
real 10m0.201s
user 0m2.576s
sys 0m0.000s
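Also, would it be a fairer read test to use a much larger block size, so that each direct I/O request covers more data? Something like this is what I had in mind (just a guess on my part, same devices as above):

time dd if=/dev/pdev0 of=/dev/null bs=1M count=1024 iflag=direct

If that comes out much faster than 1 MB/s, I suppose it would mean the small 4 KB requests (with no readahead under O_DIRECT) are what is killing the read throughput.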
Thanks,
Neha
On Thu, Apr 11, 2013 at 1:49 PM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote:
On Thu, Apr 11, 2013 at 2:50 PM, neha naik <nehanaik27@xxxxxxxxx> wrote:
>> I assume your issue is caching somewhere.
> Yes. Interestingly my direct write i/o performance is better than my direct
> read i/o performance for my passthrough device... And that doesn't make any
> kind of sense to me.
>
> pdev0 = pass through device on top of lvm
>
> root@voffice-base:/home/neha/sbd# time dd if=/dev/pdev0 of=/dev/null bs=4096
> count=1024 iflag=direct
> 1024+0 records in
> 1024+0 records out
> 4194304 bytes (4.2 MB) copied, 4.09488 s, 1.0 MB/s
>
> real 0m4.100s
> user 0m0.028s
> sys 0m0.000s
>
> root@voffice-base:/home/neha/sbd# time dd if=/dev/shm/image of=/dev/pdev0
> bs=4096 count=1024 oflag=direct
> 1024+0 records in
> 1024+0 records out
> 4194304 bytes (4.2 MB) copied, 0.0852398 s, 49.2 MB/s
>
> real 0m0.090s
> user 0m0.004s
> sys 0m0.012s
>
> Thanks,
> Neha
If the caching is in the top levels of the kernel, dd has various fsync,
fdatasync, etc. options that should address it. I note you aren't using
any of them.
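For the write side, I mean something along these lines (untested sketch, adjust the device names to your setup):

dd if=/dev/shm/image of=/dev/pdev0 bs=4096 count=1024 oflag=direct conv=fdatasync   # one fdatasync at the end
dd if=/dev/shm/image of=/dev/pdev0 bs=4096 count=1024 oflag=direct,dsync            # synchronous write for every block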
You mention LVM. It should pass cache flush commands down, but the last
I knew some flavors of mdraid will not. i.e. RAID 6 used to discard
cache flush commands, iirc. I don't know if that was ever fixed or not.
If the cache is in hardware, then dd's cache-flushing calls may or may
not get propagated all the way to the device. Some battery-backed
caches intentionally reply ACK to a cache flush command without
actually performing it.
Further, you're only writing 4 MB, which is not much of a test for most
devices. A SATA drive will typically have at least 32 MB of cache.
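If the backing store ends up on a plain SATA disk, you can at least check, or temporarily disable, the on-drive write cache; for example (assuming hdparm is available and /dev/sdX is the underlying disk, which you'd need to substitute):

hdparm -W /dev/sdX    # show the current write-caching setting
hdparm -W0 /dev/sdX   # turn write caching off for the duration of the test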
One way to ensure that results are not being corrupted by the various
caches up and down the storage stack is to write so much data that you
overwhelm the caches. That can be a huge amount of data on some
systems. i.e. a server with 128 GB of RAM may use tens of GB for
cache.
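As a rough sketch (destructive: this overwrites the device, so only run it on a scratch device you can afford to lose):

dd if=/dev/zero of=/dev/pdev0 bs=1M count=16384 oflag=direct conv=fdatasync   # ~16 GB of writes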
As you can see, testing of the write path for performance can take a
significant effort to ensure caches are not biasing your results.
HTH
Greg
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies