On Tue, Mar 09, 2010 at 02:38:57PM +0300, Michael Tokarev wrote:
> Dave Chinner wrote:
> > On Tue, Mar 09, 2010 at 01:16:01PM +0300, Michael Tokarev wrote:
> >> Karel Zak wrote:
> >>> I did almost all my tests with scsi_debug or MD RAID0 on scsi_debug.
> >>> It works as expected.
> >> Actually, for raid0, the alignment is questionable. Should it be a
> >> multiple of chunk size or whole stripe size? I'm not sure, both ways
> >> has bad and good sides.. But if it is the latter, the same issues
> >> pops up again: do a 3-disk raid0 and you'll have to align to 3*2^N.
> >
> > Yes, alignment is still needed, especially for filesystems that can
> > do stripe unit aligned allocation like XFS. If you don't align the
> > filesystem properly, all the data IO will be mis-aligned to the
> > underlying disks and stripe unit sized IO will hit multiple disks
> > rather than just one....
>
> I understand alignment is needed, the question is if the alignment
> should be to chunk size or full-stripe size. In neither case it
> will be bad for underlying disks.

Depends on the RAID implementation. High end RAID arrays often have
cache bypass features that are triggered by stripe width aligned and
sized IOs. When receiving well formed IO they can more than double
write performance because they are not limited by internal cache
mirroring bandwidth (e.g. the controller magically switches to
write-through for those well formed IOs instead of writeback). So
from that perspective, alignment needs to be to stripe width, not
stripe unit.

Similarly, for RAID5/6 alignment needs to be to stripe width, so that
a well formed IO issued by the filesystem only hits one RAID5/6
stripe.

FWIW, XFS takes great care to ensure that it doesn't place all its
allocation group headers on the same stripe unit. Failing to
distribute the AG headers evenly across all the stripe units loads
the disks/LUNs in the stripe unevenly.
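To illustrate the AG header point, here is a rough sketch (not XFS
code) that counts which stripe member the start of each allocation
group lands on. It assumes a simple RAID0-style layout with no parity
rotation and that the AG header sits at the very start of each AG. The
function names, the 64 KiB stripe unit, the 4 members and the AG sizes
are all made up for the example.

```python
from collections import Counter


def member_for_offset(offset, stripe_unit, members):
    """Index of the stripe member (disk/LUN) that holds the byte at `offset`."""
    return (offset // stripe_unit) % members


def ag_header_load(ag_size, ag_count, stripe_unit, members):
    """Count how many AG start offsets fall on each stripe member."""
    return Counter(
        member_for_offset(i * ag_size, stripe_unit, members)
        for i in range(ag_count)
    )


# Hypothetical geometry: 4 members, 64 KiB stripe unit, 256 KiB stripe width.
STRIPE_UNIT = 64 * 1024
MEMBERS = 4
STRIPE_WIDTH = STRIPE_UNIT * MEMBERS

# AG size is an exact multiple of the stripe width: every header hits member 0.
aligned = ag_header_load(1024 * STRIPE_WIDTH, 16, STRIPE_UNIT, MEMBERS)

# AG size is skewed by one stripe unit: headers rotate across all members.
skewed = ag_header_load(1024 * STRIPE_WIDTH + STRIPE_UNIT, 16, STRIPE_UNIT, MEMBERS)

print("aligned:", dict(aligned))
print("skewed: ", dict(skewed))
```

With the AG size an exact multiple of the stripe width, all 16 headers
pile onto one member; skewing the AG size by one stripe unit spreads
them four per member.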
As soon as you have uneven load on a stripe, performance tanks, as a
stripe is only as fast as its slowest member.

Also, while XFS prefers to align to stripe unit, there are mount
options to change the default allocation alignment to be stripe width
based. Hence if you have large files and applications that are doing
well formed IO, stripe width alignment of the filesystem to the
underlying block device is critical to achieving deterministic
throughput close to the maximum the hardware can support.....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
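For reference, a minimal sketch of what "well formed" means in the
message above: an IO that is both aligned to and a multiple of the
stripe width. It also counts how many members and how many full
stripes an IO touches. The helper names are hypothetical, the layout
is assumed to be a plain stripe with offsets measured from the start
of the striped device, and the 3-member, 64 KiB geometry is only an
example of the 3*2^N case raised earlier in the thread.

```python
def stripe_width(stripe_unit, members):
    """Stripe width in bytes: one stripe unit per data-bearing member."""
    return stripe_unit * members


def is_well_formed(offset, length, stripe_unit, members):
    """True if the IO is stripe width aligned and stripe width sized."""
    sw = stripe_width(stripe_unit, members)
    return length > 0 and offset % sw == 0 and length % sw == 0


def members_touched(offset, length, stripe_unit, members):
    """Number of distinct members an IO of `length` bytes at `offset` hits."""
    first = offset // stripe_unit
    last = (offset + length - 1) // stripe_unit
    return min(members, last - first + 1)


def stripes_touched(offset, length, stripe_unit, members):
    """Number of full-width stripes the IO overlaps."""
    sw = stripe_width(stripe_unit, members)
    return (offset + length - 1) // sw - offset // sw + 1


# Hypothetical 3-member stripe with a 64 KiB stripe unit (width = 3 * 2^16).
SU, N = 64 * 1024, 3
SW = stripe_width(SU, N)

# Stripe width aligned and sized: well formed, stays inside one stripe.
print(is_well_formed(0, SW, SU, N), stripes_touched(0, SW, SU, N))

# Only stripe unit aligned: not well formed, straddles two stripes.
print(is_well_formed(SU, SW, SU, N), stripes_touched(SU, SW, SU, N))

# A stripe unit sized IO hits one member when aligned, two when shifted by 4 KiB.
print(members_touched(0, SU, SU, N), members_touched(4096, SU, SU, N))
```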