On 06/15/2012 02:30 PM, Dave Chinner wrote:
> On Fri, Jun 15, 2012 at 01:25:26PM +0200, Bernd Schubert wrote:
>> On 06/15/2012 02:16 AM, Dave Chinner wrote:
>>> Oh, I just noticed you might be using CFQ (it's the default, per
>>> your dmesg). Don't - CFQ is highly unsuited for hardware RAID - it's
>>> heuristically tuned to work well on single SATA drives. Use
>>> deadline, or, preferably for hardware RAID, noop.
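For reference, the elevator can be switched at runtime through sysfs,
e.g. echo deadline > /sys/block/sda/queue/scheduler. A minimal C sketch
of the same thing (the device name sda is an assumption, and it needs
root):

/* Select the deadline elevator for an assumed device /dev/sda by
 * writing to its sysfs scheduler file. Needs root. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/block/sda/queue/scheduler", "w");

	if (!f) {
		perror("fopen");
		return 1;
	}
	fputs("deadline\n", f);	/* or "noop" for hardware RAID */
	fclose(f);
	return 0;
}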
>> I'm not sure noop is really a good recommendation even with hw
>> raid, especially if the request queue size is high. This week I
>> did some benchmarks with a high request-queue write size (triggered
>> with sync_file_range(..., SYNC_FILE_RANGE_WRITE)), and with noop
>> concurrent reads then almost entirely stalled.
>> With deadline the read/write balance was much better, although
>> writes were still preferred (both with and without
>> sync_file_range()). I always thought deadline preferred reads; I
>> hope to find some time later on to investigate what was going on.
>> The test was on a NetApp E5400, so rather high-end hardware RAID.
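As an illustration of the pattern described above, a minimal sketch of
buffered writes flushed in fixed-size batches via
sync_file_range(..., SYNC_FILE_RANGE_WRITE); the file name, batch size
and total size are assumptions, not the actual benchmark:

/* Write a test file in 16 MiB buffered batches and kick off writeback
 * of each batch immediately, so the block layer sees requests of a
 * controlled size instead of whatever background writeback produces. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BATCH	(16 * 1024 * 1024)	/* assumed batch size: 16 MiB */
#define NBATCH	64			/* total: 1 GiB */

int main(void)
{
	char *buf = malloc(BATCH);
	int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	off_t off = 0;

	if (!buf || fd < 0)
		return 1;
	memset(buf, 'x', BATCH);

	for (int i = 0; i < NBATCH; i++) {
		if (write(fd, buf, BATCH) != BATCH)
			return 1;
		/* Start writeback of exactly this batch; does not wait
		 * for completion, so the next write proceeds at once. */
		sync_file_range(fd, off, BATCH, SYNC_FILE_RANGE_WRITE);
		off += BATCH;
	}
	close(fd);
	free(buf);
	return 0;
}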
> Sounds like a case of the IO scheduler queue and/or CTQ being too
> deep.
Hmm, yes, probably. With a small request queue and the use of
sync_file_range(..., SYNC_FILE_RANGE_WRITE) we only have a small page
cache buffer. And sync_file_range() is required to get perfect IO sizes
as given by max_sectors_kb. Without sync_file_range() the IOs have a
more or less random size and are very rarely aligned to the RAID stripe
size (and yes, the mkfs.xfs options are set correctly). That is another
issue I need to find time to investigate.
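A hypothetical helper for that: read the request size limit from sysfs
and round it down to a stripe multiple, so each flushed batch both fits
max_sectors_kb and stays stripe-aligned. Device name and stripe size
are assumptions:

/* Compute the largest stripe-aligned batch size (in KiB) that does
 * not exceed the device's max_sectors_kb request limit. */
#include <stdio.h>

#define STRIPE_KB 512	/* assumed full-stripe size in KiB */

int main(void)
{
	unsigned long max_kb = 0;
	FILE *f = fopen("/sys/block/sda/queue/max_sectors_kb", "r");

	if (!f || fscanf(f, "%lu", &max_kb) != 1)
		return 1;
	fclose(f);

	/* If max_kb < STRIPE_KB this yields 0, i.e. the request limit
	 * is too small for a full stripe. */
	unsigned long batch_kb = max_kb / STRIPE_KB * STRIPE_KB;
	printf("max_sectors_kb=%lu -> stripe-aligned batch=%lu KiB\n",
	       max_kb, batch_kb);
	return 0;
}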
Cheers,
Bernd