On 04/11/13 21:11, Nicholas A. Bellinger wrote:
> On Thu, 2013-04-11 at 19:07 +0000, Rustad, Mark D wrote:
>> On Apr 11, 2013, at 11:53 AM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
>>
>>> My guess is that something between writeback and the RAID10 is blocking
>>> incoming WRITEs.
>>
>> Don't forget about the drives themselves. I have found that even when
>> using "near-line" class SATA drives, they will sometimes go off and
>> spend time doing something like some track validation. 1 - 2 seconds
>> is in line with what I have seen. I have not seen that behavior with
>> lower-capacity "real" enterprise drives, generally with SAS or FC
>> interfaces.
>>
>> Of course it is not really so much about the interface, but rather the
>> techniques that the drive makers use to ensure reliability with the
>> really high-capacity drives.
>>
>> Are the two systems being compared using the same make and model of
>> drives?
>>
> That reminds me.. Some SATA HDDs ship with WCE=1 set by default to
> favor performance, while most every SAS HDD that I've seen ships with
> WCE=0 to favor consistency in the face of power failure.
>
> Ferry, can you verify the drive firmware settings for WCE with:
>
> sdparm --get=WCE /dev/sda
> /dev/sda: ATA ST3320620AS 3.AA
> WCE 1
>
> Thanks,
>
> --nab
>

Hi,

Sorry for the late response.

WCE was disabled on the SAS drives by default. I have enabled it, but
that just switched the behaviour around :). Reads now have much larger
latency spikes, while the write spikes are largely eliminated.

In another post in this thread this link was posted:
http://serverfault.com/questions/126413/limit-linux-background-flush-dirty-pages

That seems to be what is happening here: writes are first buffered by
Linux, which then flushes them in one go, filling the disk queue and
cache all at once and probably leaving no room for reads to go fast. I
still have to look into the options presented there for tuning the
dirty cache, but it was about time I responded too :). If anyone has a
less invasive way to flush the cache, other than what is presented in
the link, that would be nice too :).

To get the maximum possible performance a lot will apparently have to
be tuned - and I'm not sure everything required can actually be tuned.
As the post already states:

"For dirty cache to work out well in this situation, linux kernel
background flusher would need to average at what speed the underlying
device accepts requests and adjust background flushing accordingly.
Not easy."

I haven't seen such options - but bear in mind that when it comes to
this material I'm just a newbie, so please don't shoot anyone based on
my posts.

As both systems are in production it is quite hard to generate similar
workloads (especially without disturbing the production load), so I
look for differences over a long(er) period of time, which slows
things down considerably (and I have a ton of other stuff to do as
well, unfortunately).

Thanks for the re's :).
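
P.S. For completeness, toggling WCE on a SCSI/SAS disk with sdparm
looks roughly like this (the device name is just an example - adjust
for your own disks, and double-check the result with --get):

    # query the current value of the Write Cache Enable bit
    sdparm --get=WCE /dev/sdX

    # enable the write cache; --save also writes it to the saved mode
    # page so it survives a power cycle
    sdparm --set=WCE --save /dev/sdX

    # disable it again
    sdparm --clear=WCE --save /dev/sdX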
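
P.P.S. As far as I understand it, the options from that serverfault
link boil down to the vm.dirty_* sysctls; something along these lines
is what I plan to experiment with (the byte values are only an
illustration, not something I have tested yet):

    # cap the amount of dirty page cache so background writeback starts
    # early and writers are throttled before the array's queue fills up
    # (illustrative values only - tune for your own hardware)
    sysctl -w vm.dirty_background_bytes=16777216   # ~16 MB: start background writeback
    sysctl -w vm.dirty_bytes=50331648              # ~48 MB: throttle writers beyond this

    # or make them persistent in /etc/sysctl.conf:
    #   vm.dirty_background_bytes = 16777216
    #   vm.dirty_bytes = 50331648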