Dear Nail,

in message <18346.39756.292908.58065@xxxxxxxxxxxxxx> you wrote:
>
> <quote>
> The second improvement is to remove a memory copy that is internal to the MD driver. The MD
> driver stages strip data ready to be written next to the I/O controller in a page size
> pre-allocated buffer. It is possible to bypass this memory copy for sequential writes thereby
> saving SDRAM access cycles.
> </quote>
>
> I sure hope you've checked that the filesystem never (ever) changes a
> buffer while it is being written out. Otherwise the data written to
> disk might be different from the data used in the parity calculation
> :-)

Sure. Note that the usage scenarios of this implementation are not only
(actually not even primarily) focused on using such a setup as a normal
RAID server - instead, processors like the 440SPe will likely be used
on RAID controller cards themselves - and data may come from iSCSI or
over one of the PCIe buses, but not from a normal file system.

> And what are the "Second memcpy" and "First memcpy" in the graph?
> I assume one is the memcpy mentioned above, but what is the other?

Avoiding the 1st memcpy means skipping the system block-level caching,
i.e. trying to use the DIRECT_IO capability (the "-dio" option to the
xdd tool, which was used for these benchmarks). Avoiding the 2nd memcpy
is the optimization for large sequential writes you quoted above.

Please keep in mind that these optimizations are probably not directly
useful for general-purpose use of a normal file system on top of the
RAID array; they have other goals: to provide benchmarks for the
special case of large synchronous I/O operations (as used by RAID
controller manufacturers to show off against their competitors), and
to provide a base for the firmware of such controllers.

Nevertheless, they clearly show where optimizations are possible,
assuming you understand your usage scenario exactly.

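
The concern quoted above (a buffer changing between the parity
calculation and the actual write) can be illustrated with a small
user-space sketch. This is not the MD driver's code; it only assumes
plain byte-wise XOR parity, and the names xor_parity and
stripe_is_consistent as well as the buffer contents are made up for
illustration.

```python
def xor_parity(chunks):
    """Return the byte-wise XOR of equally sized chunks (RAID-5 style parity)."""
    size = len(chunks[0])
    assert all(len(c) == size for c in chunks), "chunks must have equal length"
    parity = bytearray(size)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)


def stripe_is_consistent(chunks, parity):
    """Check that the stored parity matches the data chunks."""
    return xor_parity(chunks) == parity


# Hypothetical stripe: three data chunks of 8 bytes each.
others = [b"BBBBBBBB", b"CCCCCCCC"]

# Path without the staging copy: parity is computed from the caller's buffer.
page = bytearray(b"AAAAAAAA")
parity = xor_parity([page] + others)
page[0] = ord("Z")                      # owner changes the buffer before the write
written = [bytes(page)] + others        # what would actually reach the disks
zero_copy_ok = stripe_is_consistent(written, parity)

# Path with the staging copy: a private snapshot is taken first.
page2 = bytearray(b"AAAAAAAA")
staged = bytes(page2)                   # this is the memcpy being discussed
parity2 = xor_parity([staged] + others)
page2[0] = ord("Z")                     # a later change no longer matters
staged_ok = stripe_is_consistent([staged] + others, parity2)

print(zero_copy_ok, staged_ok)          # prints: False True
```

Skipping the copy is therefore only safe when the data source is
guaranteed not to touch the buffer until the write has completed.
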
In real life, your optimization may require completely different
strategies - for example, on our main file server we see the following
distribution of file sizes:

Out of a sample of 14.2e6 files,

	65%   are smaller than  4 kB
	80%   are smaller than  8 kB
	90%   are smaller than 16 kB
	96%   are smaller than 32 kB
	98.4% are smaller than 64 kB

You don't want - for example - huge stripe sizes in such a system.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@xxxxxxx
Egotist: A person of low taste, more interested in himself than in me.
                                                      - Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html