Some FS comparisons are attached as a PDF. I'm not sure what to make of
them yet, but they seemed worth posting.

On Tue, Nov 3, 2009 at 12:11 PM, mark delfman <markdelfman@xxxxxxxxxxxxxx> wrote:
> Thanks Neil,
>
> I seem to recall that I tried this on EXT3 and saw the same results as
> XFS, but with your code and suggestions I think it is well worth me
> trying some more tests and reporting back....
>
>
> Mark
>
> On Tue, Nov 3, 2009 at 4:58 AM, Neil Brown <neilb@xxxxxxx> wrote:
>> On Saturday October 31, markdelfman@xxxxxxxxxxxxxx wrote:
>>>
>>> I am hopeful that you or another member of this group could offer some
>>> advice / patch to implement the print options you suggested... if so i
>>> would happily allocate resource and time to do what i can to help
>>> with this.
>>
>>
>> I've spent a little while exploring this.
>> It appears to very definitely be an XFS problem, interacting in
>> interesting ways with the VM.
>>
>> I built a 4-drive raid6 and did some simple testing on 2.6.28.5 and
>> 2.6.28.6 using each of xfs and ext2.
>>
>> ext2 gives write throughput of 65MB/sec on .5 and 66MB/sec on .6
>> xfs gives 86MB/sec on .5 and only 51MB/sec on .6
>>
>>
>> When write_cache_pages is called it calls 'writepage' some number of
>> times.  On ext2, writepage will write at most one page.
>> On xfs writepage will sometimes write multiple pages.
>>
>> I created a patch as below that prints (in a fairly cryptic way)
>> the number of 'writepage' calls and the number of pages that XFS
>> actually wrote.
>>
>> For ext2, the number of writepage calls is at most 1536 and averages
>> around 140
>>
>> For xfs with .5, there is usually only one call to writepage and it
>> writes around 800 pages.
>> For .6 there are about 200 calls to writepage but they achieve
>> an average of about 700 pages together.
>>
>> So as you can see, there is very different behaviour.
>>
>> I notice a more recent patch in XFS in mainline which looks like a
>> dirty hack to try to address this problem.
>>
>> I suggest you try that patch and/or take this to the XFS developers.
>>
>> NeilBrown
>>
>>
>>
>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> index 08d2b96..aa4bccc 100644
>> --- a/mm/page-writeback.c
>> +++ b/mm/page-writeback.c
>> @@ -875,6 +875,8 @@ int write_cache_pages(struct address_space *mapping,
>>         int cycled;
>>         int range_whole = 0;
>>         long nr_to_write = wbc->nr_to_write;
>> +       long hidden_writes = 0;
>> +       long clear_writes = 0;
>>
>>         if (wbc->nonblocking && bdi_write_congested(bdi)) {
>>                 wbc->encountered_congestion = 1;
>> @@ -961,7 +963,11 @@ continue_unlock:
>>                         if (!clear_page_dirty_for_io(page))
>>                                 goto continue_unlock;
>>
>> +                       { int orig_nr_to_write = wbc->nr_to_write;
>>                         ret = (*writepage)(page, wbc, data);
>> +                       hidden_writes += orig_nr_to_write - wbc->nr_to_write;
>> +                       clear_writes ++;
>> +                       }
>>                         if (unlikely(ret)) {
>>                                 if (ret == AOP_WRITEPAGE_ACTIVATE) {
>>                                         unlock_page(page);
>> @@ -1008,12 +1014,37 @@ continue_unlock:
>>                 end = writeback_index - 1;
>>                 goto retry;
>>         }
>> +
>>         if (!wbc->no_nrwrite_index_update) {
>>                 if (wbc->range_cyclic || (range_whole && nr_to_write > 0))
>>                         mapping->writeback_index = done_index;
>>                 wbc->nr_to_write = nr_to_write;
>>         }
>>
>> +       { static int sum, cnt, max;
>> +               static unsigned long previous;
>> +               static int sum2, max2;
>> +
>> +               sum += clear_writes;
>> +               cnt += 1;
>> +
>> +               if (max < clear_writes) max = clear_writes;
>> +
>> +               sum2 += hidden_writes;
>> +               if (max2 < hidden_writes) max2 = hidden_writes;
>> +
>> +               if (cnt > 100 && time_after(jiffies, previous + 10*HZ)) {
>> +                       printk("write_page_cache: sum=%d cnt=%d max=%d mean=%d sum2=%d max2=%d mean2=%d\n",
>> +                              sum, cnt, max, sum/cnt,
>> +                              sum2, max2, sum2/cnt);
>> +                       sum = 0;
>> +                       cnt = 0;
>> +                       max = 0;
>> +                       max2 = 0;
>> +                       sum2 = 0;
>> +                       previous = jiffies;
>> +               }
>> +       }
>>         return ret;
>>  }
>>  EXPORT_SYMBOL(write_cache_pages);
>>
>>
>> ------------------------------------------------------
>> From c8a4051c3731b6db224482218cfd535ab9393ff8 Mon Sep 17 00:00:00 2001
>> From: Eric Sandeen <sandeen@xxxxxxxxxxx>
>> Date: Fri, 31 Jul 2009 00:02:17 -0500
>> Subject: [PATCH] xfs: bump up nr_to_write in xfs_vm_writepage
>>
>> VM calculation for nr_to_write seems off.  Bump it way
>> up, this gets simple streaming writes zippy again.
>> To be reviewed again after Jens' writeback changes.
>>
>> Signed-off-by: Christoph Hellwig <hch@xxxxxxxxxxxxx>
>> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxxx>
>> Cc: Chris Mason <chris.mason@xxxxxxxxxx>
>> Reviewed-by: Felix Blyakher <felixb@xxxxxxx>
>> Signed-off-by: Felix Blyakher <felixb@xxxxxxx>
>> ---
>>  fs/xfs/linux-2.6/xfs_aops.c |    8 ++++++++
>>  1 files changed, 8 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
>> index 7ec89fc..aecf251 100644
>> --- a/fs/xfs/linux-2.6/xfs_aops.c
>> +++ b/fs/xfs/linux-2.6/xfs_aops.c
>> @@ -1268,6 +1268,14 @@ xfs_vm_writepage(
>>         if (!page_has_buffers(page))
>>                 create_empty_buffers(page, 1 << inode->i_blkbits, 0);
>>
>> +
>> +       /*
>> +        * VM calculation for nr_to_write seems off.  Bump it way
>> +        * up, this gets simple streaming writes zippy again.
>> +        * To be reviewed again after Jens' writeback changes.
>> +        */
>> +       wbc->nr_to_write *= 4;
>> +
>>         /*
>>          * Convert delayed allocate, unwritten or unmapped space
>>          * to real space and flush out to disk.
>> --
>> 1.6.4.3
>>
>>
>
Attachment:
FS test.pdf
Description: Adobe PDF document