On Thu, May 16, 2013 at 11:35:08AM -0400, David Oostdyk wrote:
> On 05/16/13 07:36, Stan Hoeppner wrote:
> >On 5/15/2013 7:59 PM, Dave Chinner wrote:
> >>[cc xfs list, seeing as that's where all the people who use XFS in
> >>these sorts of configurations hang out.]
> >>
> >>On Fri, May 10, 2013 at 10:04:44AM -0400, David Oostdyk wrote:
> >>>As a basic benchmark, I have an application that simply writes the
> >>>same buffer (say, 128MB) to disk repeatedly.  Alternatively you
> >>>could use the "dd" utility.  (For these benchmarks, I set
> >>>/proc/sys/vm/dirty_bytes to 512M or lower, since these systems
> >>>have a lot of RAM.)
> >>>
> >>>The basic observations are:
> >>>
> >>>1. "single-threaded" writes, either to a file on the mounted
> >>>filesystem or with a "dd" to the raw RAID device, seem to be
> >>>limited to 1200-1400MB/sec.  These numbers vary slightly based on
> >>>whether TurboBoost is affecting the writing process or not.  "top"
> >>>will show this process running at 100% CPU.
> >>
> >>Expected. You are using buffered IO. Write speed is limited by the
> >>rate at which your user process can memcpy data into the page cache.
> >>
> >>>2. With two benchmarks running on the same device, I see aggregate
> >>>write speeds of up to ~2.4GB/sec, which is closer to what I'd
> >>>expect the drives to be capable of delivering.  This can either be
> >>>with two applications writing to separate files on the same mounted
> >>>filesystem, or two separate "dd" applications writing to distinct
> >>>locations on the raw device.
> >
> >2.4GB/s is the interface limit of quad lane 6G SAS.  Coincidence?  If
> >you've daisy chained the SAS expander backplanes within a server
> >chassis (9266-8i/72405), or between external enclosures
> >(9285-8e/71685), and have a single 4 lane cable
> >(SFF-8087/8088/8643/8644) connected to your RAID card, this would
> >fully explain the 2.4GB/s wall, regardless of how many parallel
> >processes are writing, or any other software factor.
> >
> >But surely you already know this, and you're using more than one 4
> >lane cable.  Just covering all the bases here, due to seeing 2.4GB/s
> >as the stated wall.  This number is just too coincidental to ignore.
> 
> We definitely have two 4-lane cables being used, but this is an
> interesting coincidence.  I'd be surprised if anyone could really
> achieve the theoretical throughput on one cable, though.  We have one
> JBOD that only takes a single 4-lane cable, and we seem to cap out at
> closer to 1450MB/sec on that unit.  (This is just a single point of
> reference, and I don't have many tests where only one 4-lane cable
> was in use.)

You can get pretty close to the theoretical limit on the back end SAS
cables - just like you can with FC.

What I'd suggest you do is look at the RAID card configuration - often
they default to active/passive failover configurations when there are
multiple channels to the same storage.  Then they only use one of the
cables for all traffic.  Some RAID cards offer active/active or "load
balanced" options where all back end paths are used in redundant
configurations rather than just one....

> You guys hit the nail on the head!  With O_DIRECT I can use a single
> writer thread and easily see the best throughput that I _ever_ saw in
> the multiple-writer case (~2.4GB/sec), and "top" shows the writer at
> 10% CPU usage.  I've modified my application to use O_DIRECT and it
> makes a world of difference.

Be aware that O_DIRECT is not a magic bullet.  It can make your IO go
a lot slower on some workloads and storage configs....
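
As an illustration, here is a minimal sketch of a single-threaded
direct IO writer along the lines of the benchmark described above.
The file path, fill pattern and write count are arbitrary placeholders,
not taken from the thread; the part that matters is that O_DIRECT
requires the buffer address, IO size and file offset to be aligned to
the device's logical sector size, hence the posix_memalign() call:

/* Minimal sketch of a single-threaded O_DIRECT writer.  The path,
 * fill pattern and write count below are placeholders. */
#define _GNU_SOURCE             /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t buf_size = 128UL * 1024 * 1024;  /* 128MB, as in the benchmark */
    const size_t align    = 4096;                 /* assumed sector alignment */
    void *buf;

    /* O_DIRECT needs a sector-aligned buffer; plain malloc() is not enough. */
    if (posix_memalign(&buf, align, buf_size) != 0) {
        perror("posix_memalign");
        return 1;
    }
    memset(buf, 0xab, buf_size);

    /* O_DIRECT bypasses the page cache, so throughput is no longer
     * limited by one CPU memcpy()ing data into kernel pages. */
    int fd = open("/mnt/test/odirect_test", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    for (int i = 0; i < 16; i++) {                /* 16 x 128MB = 2GB total */
        if (write(fd, buf, buf_size) != (ssize_t)buf_size) {
            perror("write");
            return 1;
        }
    }

    close(fd);
    free(buf);
    return 0;
}

The 4KiB alignment is used here as a safe default; devices with
512 byte logical sectors only strictly require 512 byte alignment.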
> [It's interesting that you see performance benefits for O_DIRECT even
> with a single SATA drive.  The reason it took me so long to test
> O_DIRECT in this case is that I never saw any significant benefit
> from using it in the past.  But that was when I didn't have such fast
> storage, so I probably wasn't hitting the bottleneck with buffered
> I/O?]

Right - for applications not designed to use direct IO from the ground
up, this is typically the case - buffered IO is faster right up to the
point where you run out of CPU....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx