On Thu, Nov 19, 2009 at 01:37:46PM +0000, Steve wrote:
> Pasi Kärkkäinen wrote:
>> You should use oflag=direct to make it actually write the file to disk..
>> And now most probably the file will come from the linux kernel cache.
>> Use iflag=direct to actually read it from the disk.
>
> However, in the real world data _is_ going to be cached via the kernel
> cache, at least (we hope) a stride's worth minimum. We're talking about
> recording video aren't we, and that's surely almost always sequentially
> written, not random seeks everywhere?

True. Video is going to be written and read sequentially. However, the
effect of a cache is always a short-term gain.

E.g. write caches mask a slow disk by signaling "ready" to the
application while in reality the kernel is still holding the data in RAM.
If you continue to write faster than the disk can handle, the cache will
fill up, and at some point your application's write requests will be
slowed down to what the disk can handle. If, however, your application
writes to the same block again before the cache has been written to disk,
then the cache truly has gained you performance even in the long run, by
avoiding a write of data that has already been replaced.

Same thing with read caches. They only help if you are reading the same
data again. The effect that you _will_ see is that of read-ahead. That
helps if your application reads one block, then another, and the kernel
has already looked ahead and fetched more blocks than originally
requested from the disk. This also helps avoid excessive seeking if you
are reading from more than one place on the disk at once. But again, the
effect on read throughput fades away when you read large amounts of data
only once.

What it boils down to is this: caches improve latency, not throughput.
What read-ahead and write caches will do in this scenario is help you
mask the effects of seeks on your disk, by reading ahead and by
aggregating write requests and sorting them in a way that reduces seek
times.

In this regard, writing multiple streams is easier than reading. When
writing, you can let your kernel decide to keep some of the data 10 or
15 seconds in RAM before committing it to disk. However, if you are
_reading_, you will be pretty miffed if your video stalls for 15 seconds
because the kernel found something more interesting to read first :-)

> For completeness, the results are:
>
> # dd if=/dev/zero of=/srv/test/delete.me bs=1M count=1024 oflag=direct
> 1073741824 bytes (1.1 GB) copied, 25.2477 s, 42.5 MB/s

Interesting. The difference between this and the "oflag=fsync" run is
that in the latter the kernel gets to sort all of the write requests
more or less as it wants to. So I guess for recording video, the 73 MB/s
will be your bandwidth, while this test here shows the performance that
a data-integrity-focused application, e.g. a database, will get from
your RAID.

> # dd if=/srv/test/delete.me of=/dev/null bs=1M count=1024 iflag=direct
> 1073741824 bytes (1.1 GB) copied, 4.92771 s, 218 MB/s
>
> So, still no issue with recording entire transponders; using 1/4 of the
> available raw bandwidth with no buffering.

Well, 1/4 of the bandwidth used by one client or shared by multiple
clients can make all the difference. How about running some tests with
"cstream"? I only did a quick apt-cache search, but it seems cstream
could be used to simulate clients with various bandwidth needs and to
measure the bandwidth that is left.
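Untested, but something along these lines should do, assuming cstream's
-i/-o/-t/-v options work the way the man page suggests (the file path
and the ~10 MB/s rate are just placeholders):

# cstream -i /srv/test/delete.me -o /dev/null -t 10000000 -v 1

Run a few of those in parallel, at rates matching your expected clients,
while a dd write is going on, and compare the write throughput you get
with and without the simulated readers.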
> Interesting stuff, this :)

Very interesting indeed. Thanks for enriching this discussion with real
data!

cheers
-henrik