On Thu, Jan 27, 2011 at 06:09:58PM -0800, david@xxxxxxx wrote:
> On Thu, 27 Jan 2011, Stan Hoeppner wrote:
> >david@xxxxxxx put forth on 1/27/2011 2:11 PM:
> >
> >>how do I understand how to setup things on multi-disk systems? the documentation
> >>I've found online is not that helpful, and in some ways contradictory.
> >
> >Visit http://xfs.org There you will find:
> >
> >Users guide:
> >http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide//tmp/en-US/html/index.html
> >
> >File system structure:
> >http://xfs.org/docs/xfsdocs-xml-dev/XFS_Filesystem_Structure//tmp/en-US/html/index.html
> >
> >Training labs:
> >http://xfs.org/docs/xfsdocs-xml-dev/XFS_Labs/tmp/en-US/html/index.html
>
> thanks for the pointers.
>
> >>If there really are good rules for how to do this, it would be very helpful if
> >>you could just give mkfs.xfs the information about your system (this partition
> >>is on a 16 drive raid6 array) and have it do the right thing.
> >
> >If your disk array is built upon Linux mdraid, recent versions of mkfs.xfs will
> >read the parameters and automatically make the filesystem accordingly, properly.
> >
> >mkfs.xfs will not do this for PCIe/x hardware RAID arrays or external FC/iSCSI
> >based SAN arrays as there is no standard place to acquire the RAID configuration
> >information for such systems. For these you will need to configure mkfs.xfs
> >manually.
> >
> >At minimum you will want to specify stripe width (sw) which needs to match the
> >hardware stripe width. For RAID0 sw=[#of_disks]. For RAID 10, sw=[#disks/2].
> >For RAID5 sw=[#disks-1]. For RAID6 sw=[#disks-2].
> >
> >You'll want at minimum agcount=16 for striped hardware arrays. Depending on the
> >number and spindle speed of the disks, the total size of the array, the
> >characteristics of the RAID controller (big or small cache), you may want to
> >increase agcount. Experimentation may be required to find the optimum
> >parameters for a given hardware RAID array.
> >Typically all other parameters may be left at defaults.
>
> does this value change depending on the number of disks in the array?

Only depending on block device capacity. Once at the maximum AG size
(1TB), mkfs has to add more AGs. So once above 4TB for hardware RAID
LUNs and 16TB for md/dm devices, you will get an AG per TB of
storage by default.

As it is, the optimal number and size of AGs will depend on as many
geometry factors as workload factors, such as the size of the LUNs,
the way they are striped, whether you are using linear concatenation
of LUNs or striping them or a combination of both, the amount of
allocation concurrency you require, etc.

In these sorts of situations, mkfs can only make a best guess - to
do better you really need someone proficient in the dark arts to
configure the storage and filesystem optimally.

> >Picking the perfect mkfs.xfs parameters for a hardware RAID array can be
> >somewhat of a black art, mainly because no two vendor arrays act or perform
> >identically.
>
> if mkfs.xfs can figure out how to do the 'right thing' for md raid
> arrays, can there be a mode where it asks the users for the same
> information that it gets from the kernel?

mkfs.xfs can get the information it needs directly from dm and md
devices. However, when hardware RAID LUNs present themselves to the
OS in an identical manner to single drives, how does mkfs tell the
difference between a 2TB hardware RAID LUN made up of 30x73GB drives
and a single 2TB SATA drive?

The person running mkfs should already know this little detail....

> >Systems of a caliber requiring XFS should be thoroughly tested before going into
> >production. Testing _with your workload_ of multiple parameters should be
> >performed to identify those yielding best performance.
>
> <rant>
> the problem with this is that for large arrays, formatting the array
> and loading it with data can take a day or more, even before you
> start running the test.
> This is made even worse if you are scaling
> up an existing system a couple orders of magnitude, because you may
> not have the full workload available to you.

If your hardware procurement-to-production process doesn't include
testing performance of potential equipment on a representative
workload, then I'd say you have a process problem that we can't help
you solve....

> Saying that you should
> test out every option before going into production is a cop-out.

I never test every option.

I know what the options do, so to decide what to tweak (if anything)
what I first need to know is how a workload performs on a given
storage layout with default options. I need to have:

	a) some idea of the expected performance of the workload
	b) a baseline performance characterisation of the underlying
	   block devices
	c) a set of baseline performance metrics from a
	   representative workload on a default filesystem
	d) spent some time analysing the baseline metrics for
	   evidence of sub-optimal performance characteristics.

Once I have that information, I can suggest meaningful ways (if any)
to change the storage and filesystem configuration that may improve
the performance of the workload.

BTW, if you ask me how to optimise an ext4 filesystem for the same
workload, I'll tell you straight up that I have no idea and that you
should ask an ext4 expert....

> The better you can test it, the better off you are, but without
> knowing what the knobs do, just doing a test and twiddling the
> knobs to do another test isn't very useful.

Well, yes, that is precisely the reason you should use the defaults.
It's also the reason we have experts - they know what knob to
twiddle to fix specific problems. If you prefer to twiddle knobs
like Blind Freddy, then you should expect things to go wrong....

> If there is a way to
> set the knobs in the general ballpark,

Have you ever considered that this is exactly what mkfs does when
you use the defaults?
And that this is the fundamental reason we keep saying
"use the defaults"?

> then you can test and see
> if the performance seems adequate, if not you can try tweaking one
> of the knobs a little bit and see if it helps or hurts. but if the
> knobs aren't even in the ballpark when you start, this doesn't
> help much.

The thread has now come full circle - you're ranting about not
knowing what knobs do or how to set reasonable values so you want to
twiddle random knobs to see if they do anything as the basis of
your optimisation process.

This is the exact process that led to the bug report that started
this thread - a tweak-without-understanding configuration leading to
undesirable behavioural characteristics from the filesystem.....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
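
For reference, the stripe-width rules quoted earlier in the thread (sw is
the number of data-bearing disks: all of them for RAID0, half for RAID 10,
one fewer for RAID5, two fewer for RAID6) can be written down as a small
Python sketch. This is only an illustration: the function names
stripe_width and mkfs_data_options are made up for this example, the 64k
stripe unit is an assumed chunk size rather than a recommendation, and
agcount=16 is simply the minimum suggested in the quoted text for striped
hardware arrays.

```python
def stripe_width(raid_level: str, disks: int) -> int:
    """Return sw (number of data-bearing disks) per the rules quoted in the thread."""
    rules = {
        "raid0": lambda n: n,        # every disk holds data
        "raid10": lambda n: n // 2,  # half the disks are mirrors
        "raid5": lambda n: n - 1,    # one disk's worth of parity
        "raid6": lambda n: n - 2,    # two disks' worth of parity
    }
    level = raid_level.lower()
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    sw = rules[level](disks)
    if sw < 1:
        raise ValueError(f"{disks} disks is too few for {raid_level}")
    return sw


def mkfs_data_options(raid_level: str, disks: int,
                      stripe_unit: str = "64k", agcount: int = 16) -> str:
    """Build a value for mkfs.xfs -d; stripe_unit is an ASSUMED chunk size."""
    sw = stripe_width(raid_level, disks)
    return f"su={stripe_unit},sw={sw},agcount={agcount}"


if __name__ == "__main__":
    # The 16-drive RAID6 array mentioned in the thread.
    print(mkfs_data_options("raid6", 16))  # prints su=64k,sw=14,agcount=16
```

The resulting string would be passed as something like
mkfs.xfs -d su=64k,sw=14,agcount=16 /dev/sdX, where /dev/sdX is a
placeholder and su has to match the array's real chunk size. As the reply
above stresses, this is only worth doing when mkfs cannot detect the
geometry itself and you already know why the defaults fall short.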