Re: recommendations for stripe/chunk size

Dear Neil,

in message <18346.39756.292908.58065@xxxxxxxxxxxxxx> you wrote:
> 
> <quote>
> The second improvement is to remove a memory copy that is internal to the MD driver. The MD
> driver stages strip data ready to be written next to the I/O controller in a page size pre-
> allocated buffer. It is possible to bypass this memory copy for sequential writes thereby saving
> SDRAM access cycles.
> </quote>
> 
> I sure hope you've checked that the filesystem never (ever) changes a
> buffer while it is being written out.  Otherwise the data written to
> disk might be different from the data used in the parity calculation
> :-)

Sure. Note that the usage scenarios of this implementation are not
only (actually not even primarily) focussed on using such a setup as
a normal RAID server - instead, processors like the 440SPe will
likely be used on RAID controller cards themselves - and data may
come from iSCSI or over one of the PCIe busses, but not from a normal
file system.
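
To make the concern in the quoted question concrete, here is a toy
model of XOR parity over one stripe. It is only a sketch and not MD
driver code: the strip contents and the helper names (xor_parity,
rebuild) are made up for illustration. It shows what goes wrong if a
buffer changes after the parity has been computed from it but before
the data reaches the disk.

```python
from functools import reduce


def xor_parity(strips):
    """XOR equal-length byte strips together, as RAID-5 style parity does."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def rebuild(surviving_strips, parity):
    """Reconstruct the one missing strip from the survivors plus parity."""
    return xor_parity(list(surviving_strips) + [parity])


# Three data strips of one stripe (hypothetical contents); bytearray so
# that they can be modified in place, like a shared page.
strips = [bytearray(b"AAAA"), bytearray(b"BBBB"), bytearray(b"CCCC")]

# Parity is computed from the buffers as they are right now ...
parity = xor_parity(strips)

# ... but without a private copy, a buffer can still change before it is
# actually written out.
strips[1][0:1] = b"X"
on_disk = [bytes(s) for s in strips]

# If strip 0 is lost later, the rebuild no longer returns what was on disk.
recovered = rebuild(on_disk[1:], parity)
print(recovered == on_disk[0])  # prints False
```

With the staging copy in place, parity and data are both taken from the
same private snapshot, so this mismatch cannot occur; removing the copy
is only safe when the data source guarantees stable buffers.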

> And what are the "Second memcpy" and "First memcpy" in the graph?
> I assume one is the memcpy mentioned above, but what is the other?

Avoiding the 1st memcpy means skipping the system's block-level
caching, i.e. using the DIRECT_IO capability (the "-dio" option of
the xdd tool, which was used for these benchmarks).
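
For readers who want to try cache-bypassing writes outside of xdd, the
following is a minimal sketch using the Linux O_DIRECT open flag. It
rests on several assumptions that are not from the benchmarks: the 4096
byte alignment, the helper names, and the fallback to a buffered write
when the filesystem rejects O_DIRECT are all illustrative choices.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the device


def _write_all(path, flags, buf):
    """Open path with the given flags and write the whole buffer once."""
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, buf)
    finally:
        os.close(fd)


def write_direct(path, payload):
    """Write payload (zero-padded to a multiple of BLOCK) to path.

    Returns True if the write went through O_DIRECT, False if the code
    had to fall back to an ordinary buffered write.
    """
    # O_DIRECT needs an aligned buffer and an aligned length. An anonymous
    # mmap is page-aligned, so it is used as the staging buffer here.
    length = max(BLOCK, -(-len(payload) // BLOCK) * BLOCK)
    buf = mmap.mmap(-1, length)
    buf[:len(payload)] = payload

    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    direct = getattr(os, "O_DIRECT", 0)  # the flag only exists on some platforms
    try:
        if direct:
            try:
                _write_all(path, flags | direct, buf)
                return True
            except OSError:
                pass  # e.g. a filesystem that does not support O_DIRECT
        _write_all(path, flags, buf)
        return False
    finally:
        buf.close()
```

A call such as write_direct("/some/path", b"data") leaves a file whose
size is a multiple of BLOCK; real benchmarks would of course use much
larger, repeated transfers.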

The 2nd memcpy is the optimization for large  sequential  writes  you
quoted above.

Please keep in mind that these optimizations are probably not
directly useful for general-purpose use of a normal file system on
top of the RAID array; they have other goals: to provide benchmarks
for the special case of large synchronous I/O operations (as used by
RAID controller manufacturers to show off against their competitors),
and to provide a base for the firmware of such controllers.

Nevertheless, they clearly show where optimizations are possible,
assuming you understand your usage scenario exactly.

In real life, your optimization may require completely different
strategies - for example, on our main file server we see the
following distribution of file sizes:

Out of a sample of 14.2e6 files,

          65%    are smaller than  4 kB
          80%    are smaller than  8 kB
          90%    are smaller than 16 kB
          96%    are smaller than 32 kB
        98.4%    are smaller than 64 kB
	
You don't want - for example - huge stripe sizes in such a system.
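
As a rough illustration of why, the sketch below takes the cumulative
figures above and asks, for a few candidate chunk sizes, what share of
files is smaller than one full stripe. The array geometry (4 data
disks) and the candidate chunk sizes are assumptions chosen for the
example, and the model ignores alignment and any batching of writes by
the file system; it only gives a lower bound, since nothing is known
about files of 64 kB and above.

```python
# Cumulative file-size distribution quoted in the message:
# (upper bound in kB, fraction of files smaller than that bound).
CUMULATIVE = [(4, 0.65), (8, 0.80), (16, 0.90), (32, 0.96), (64, 0.984)]


def fraction_smaller_than(size_kb):
    """Lower bound on the fraction of files smaller than size_kb."""
    best = 0.0
    for bound_kb, fraction in CUMULATIVE:
        if bound_kb <= size_kb:
            best = fraction
    return best


def full_stripe_kb(chunk_kb, data_disks):
    """Amount of data held by one full stripe of the array."""
    return chunk_kb * data_disks


if __name__ == "__main__":
    data_disks = 4  # hypothetical geometry, not taken from the message
    for chunk_kb in (4, 16, 64, 256):
        stripe_kb = full_stripe_kb(chunk_kb, data_disks)
        share = fraction_smaller_than(stripe_kb)
        print(f"chunk {chunk_kb:>3} kB, stripe {stripe_kb:>4} kB: "
              f"at least {share:.1%} of files are smaller than one stripe")
```

On a parity RAID, a write smaller than a full stripe generally cannot
be done as a plain full-stripe write, so the larger the stripe, the
larger the share of files on such a server that fall into the
partial-stripe case.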

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@xxxxxxx
Egotist: A person of low taste, more interested in  himself  than  in
me.                                                  - Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
