Re: MD write performance issue - found Catalyst patches


 



Attached are later kernel results... not an awful lot of difference
(apart from the native run, due to the fact that 2.6.28.6 doesn't have
the patch included). 2.6.32-rc6 is certainly up to 10% faster on RAID-6.

Note we are only running around 10 tests on each, which is a low number
for averaging, and the results move around by 100MB-plus... but in this
case we did not need to be overly accurate. They show maybe a 20%
reduction when writing to the FS as opposed to directly to MD.

Whilst the reads on XFS are now 20% 'faster' than reads from the raw
device (reaching 2GB/s)... 2.6.3x seems better at read caching on XFS. I
have only graphed the writes...
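
For anyone wanting to reproduce the comparison: the thread does not spell out
the exact test commands, so the following is only a minimal userspace sketch
of the kind of streaming-write measurement being discussed (a file on the
filesystem versus the raw /dev/mdX device). The buffer size, total size and
the final fsync are my assumptions, not necessarily what was actually run:

/* seqwrite.c - crude sequential-write throughput check (illustrative only) */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define BUF_SZ  (1024 * 1024)                   /* 1 MiB per write() */
#define TOTAL   (8LL * 1024 * 1024 * 1024)      /* 8 GiB per run */

int main(int argc, char **argv)
{
        if (argc < 2) {
                fprintf(stderr, "usage: %s <file-or-md-device>\n", argv[0]);
                return 1;
        }
        char *buf = malloc(BUF_SZ);
        memset(buf, 0xab, BUF_SZ);

        int fd = open(argv[1], O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);

        long long written = 0;
        while (written < TOTAL) {
                ssize_t n = write(fd, buf, BUF_SZ);
                if (n <= 0) { perror("write"); return 1; }
                written += n;
        }
        fsync(fd);                              /* count the cache flush too */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        close(fd);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%lld bytes in %.1f s = %.1f MB/s\n", written, secs,
               written / secs / 1e6);
        return 0;
}

Running something like this around 10 times against the filesystem and against
the bare MD device, then averaging, gives numbers of the same flavour as the
attached graphs.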

Mark



On Thu, Nov 5, 2009 at 7:09 PM, Asdo <asdo@xxxxxxxxxxxxx> wrote:
> Great!
> So the dirty hack, pumped up to x16, really does work! (while we wait for
> Jens, as written in the patch: "To be reviewed again after Jens' writeback
> changes.") Thanks for having tried up to x32.
> Still, RAID-6 XFS write is not yet back up to the old speed... maybe the old
> code was better at filling RAID stripes exactly, who knows.
> Mark, yep, personally I would be very interested in seeing how 2.6.31
> performs on your hardware, so I can e.g. see exactly how much my 3ware 9650
> controllers suck... (so please also try vanilla 2.6.31, which I think has an
> integrated x4 hack; do not just try with x16, please)
> We might also be interested in 2.6.32 performance if you have time, also
> because 2.6.32 includes the fixes for the CPU lockups in big arrays during
> resyncs which were reported on this list, and this is a good incentive for
> upgrading (Neil, btw, is there any chance those lockup fixes get backported
> to the stable 2.6.31.x series?).
> Thank you!
> Asdo
>
>
> mark delfman wrote:
>>
>> Hi Gents,
>>
>> Attached are the results of some testing with the XFS patch... as we can
>> see, it does make a reasonable difference!  Changing the value between
>> 4, 16 and 32 shows 16 is a good level...
>>
>> Is this a 'safe' patch at 16?
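
[For reference: "16" here is the multiplier in Eric Sandeen's nr_to_write
hack quoted at the bottom of this thread, which ships as "wbc->nr_to_write
*= 4;". The x16 and x32 runs presumably just change that one line, e.g.:

        wbc->nr_to_write *= 16;    /* x16 variant; x32 was also tried */

so the only knob being swept is how far XFS inflates the VM's per-pass
writeback budget.]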
>>
>> I think that maybe there is still some performance to be gained,
>> especially in the RAID-6 configs, which is where most people would be
>> interested, I suspect... but it's a great start!
>>
>>
>> I think that I should jump up to maybe 2.6.31 and see how this reacts...
>>
>> Neil, I applied your writepage patch and have the outputs, if these are of
>> interest...
>>
>> Thank you for the help with the patching and Linux!
>>
>>
>> mark
>>
>>
>>
>> On Wed, Nov 4, 2009 at 5:25 PM, Asdo <asdo@xxxxxxxxxxxxx> wrote:
>>
>>>
>>> Hey great job Neil and Mark
>>> Mark, your benchmarks seem to confirm Neil's analysis: ext2 and ext3 are
>>> not slowed down between 2.6.28.5 and 2.6.28.6.
>>> Mark, why don't you try applying the patch below by Eric Sandeen, found
>>> by Neil, to 2.6.28.6 to see if the XFS write performance comes back?
>>> Thank you for your efforts
>>> Asdo
>>>
>>> mark delfman wrote:
>>>
>>>>
>>>> Some FS comparisons are attached as a PDF.
>>>>
>>>> Not sure what to make of them as yet, but worth posting.
>>>>
>>>>
>>>> On Tue, Nov 3, 2009 at 12:11 PM, mark delfman
>>>> <markdelfman@xxxxxxxxxxxxxx> wrote:
>>>>
>>>>
>>>>>
>>>>> Thanks Neil,
>>>>>
>>>>> I seem to recall that I tried this on EXT3 and saw the same results as
>>>>> XFS, but with your code and suggestions I think it is well worth me
>>>>> trying some more tests and reporting back....
>>>>>
>>>>>
>>>>> Mark
>>>>>
>>>>> On Tue, Nov 3, 2009 at 4:58 AM, Neil Brown <neilb@xxxxxxx> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> On Saturday October 31, markdelfman@xxxxxxxxxxxxxx wrote:
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I am hopeful that you or another member of this group could offer some
>>>>>>> advice / a patch to implement the print options you suggested... if so,
>>>>>>> I would happily allocate resources and time to do what I can to help
>>>>>>> with this.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I've spent a little while exploring this.
>>>>>> It appears to very definitely be an XFS problem, interacting in
>>>>>> interesting ways with the VM.
>>>>>>
>>>>>> I built a 4-drive raid6 and did some simple testing on 2.6.28.5 and
>>>>>> 2.6.28.6 using each of xfs and ext2.
>>>>>>
>>>>>> ext2 gives write throughput of 65MB/sec on .5 and 66MB/sec on .6
>>>>>> xfs gives 86MB/sec on .5 and only 51MB/sec on .6
>>>>>>
>>>>>>
>>>>>> When write_cache_pages is called it calls 'writepage' some number of
>>>>>> times.  On ext2, writepage will write at most one page.
>>>>>> On xfs writepage will sometimes write multiple pages.
>>>>>>
>>>>>> I created a patch as below that prints (in a fairly cryptic way)
>>>>>> the number of 'writepage' calls and the number of pages that XFS
>>>>>> actually wrote.
>>>>>>
>>>>>> For ext2, the number of writepage calls is at most 1536 and averages
>>>>>> around 140
>>>>>>
>>>>>> For xfs with .5, there is usually only one call to writepage and it
>>>>>> writes around 800 pages.
>>>>>> For .6 there are about 200 calls to writepage, but together they
>>>>>> achieve an average of about 700 pages.
>>>>>>
>>>>>> So as you can see, there is very different behaviour.
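
[A toy userspace model may make the accounting above more concrete. This is
not kernel code, and the budget and cluster sizes are invented for
illustration only: the point is simply that a clustering writepage (XFS-style)
charges whole clusters against write_cache_pages' fixed nr_to_write budget and
so exhausts it in very few calls, while a one-page-per-call writepage
(ext2-style) spreads the same budget over many calls, and that multiplying the
budget, as the XFS patch further down does, lets proportionally more pages out
per writeback pass:

/* toy_nr_to_write.c - illustrative only, numbers are made up */
#include <stdio.h>

#define BUDGET   1024L   /* pages the VM allows per write_cache_pages pass */
#define CLUSTER  64      /* pages an xfs-like writepage flushes per call   */

static void simulate(const char *name, int pages_per_call, int boost)
{
        long nr_to_write = BUDGET * boost;  /* boost models "nr_to_write *= N" */
        long calls = 0, pages = 0;

        while (nr_to_write > 0) {
                /* one ->writepage call: the fs writes pages_per_call pages
                 * and the whole amount is charged against the budget */
                pages += pages_per_call;
                nr_to_write -= pages_per_call;
                calls++;
        }
        printf("%-20s boost x%-2d : %4ld writepage calls, %5ld pages per pass\n",
               name, boost, calls, pages);
}

int main(void)
{
        simulate("ext2 (1 page/call)", 1, 1);
        simulate("xfs (clustered)",   CLUSTER, 1);
        simulate("xfs (clustered)",   CLUSTER, 4);    /* the mainline hack   */
        simulate("xfs (clustered)",   CLUSTER, 16);   /* Mark's best setting */
        return 0;
}

The real interaction is of course messier than this, but it shows why bumping
nr_to_write inside xfs_vm_writepage changes how much gets written per pass.]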
>>>>>>
>>>>>> I notice a more recent patch in XFS in mainline which looks like a
>>>>>> dirty hack to try to address this problem.
>>>>>>
>>>>>> I suggest you try that patch and/or take this to the XFS developers.
>>>>>>
>>>>>> NeilBrown
>>>>>>
>>>>>>
>>>>>>
>>>>>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>>>>>> index 08d2b96..aa4bccc 100644
>>>>>> --- a/mm/page-writeback.c
>>>>>> +++ b/mm/page-writeback.c
>>>>>> @@ -875,6 +875,8 @@ int write_cache_pages(struct address_space *mapping,
>>>>>>      int cycled;
>>>>>>      int range_whole = 0;
>>>>>>      long nr_to_write = wbc->nr_to_write;
>>>>>> +       long hidden_writes = 0;
>>>>>> +       long clear_writes = 0;
>>>>>>
>>>>>>      if (wbc->nonblocking && bdi_write_congested(bdi)) {
>>>>>>              wbc->encountered_congestion = 1;
>>>>>> @@ -961,7 +963,11 @@ continue_unlock:
>>>>>>                      if (!clear_page_dirty_for_io(page))
>>>>>>                              goto continue_unlock;
>>>>>>
>>>>>> +                       { int orig_nr_to_write = wbc->nr_to_write;
>>>>>>                      ret = (*writepage)(page, wbc, data);
>>>>>> +                       hidden_writes += orig_nr_to_write - wbc->nr_to_write;
>>>>>> +                       clear_writes ++;
>>>>>> +                       }
>>>>>>                      if (unlikely(ret)) {
>>>>>>                              if (ret == AOP_WRITEPAGE_ACTIVATE) {
>>>>>>                                      unlock_page(page);
>>>>>> @@ -1008,12 +1014,37 @@ continue_unlock:
>>>>>>              end = writeback_index - 1;
>>>>>>              goto retry;
>>>>>>      }
>>>>>> +
>>>>>>      if (!wbc->no_nrwrite_index_update) {
>>>>>>              if (wbc->range_cyclic || (range_whole && nr_to_write > 0))
>>>>>>                      mapping->writeback_index = done_index;
>>>>>>              wbc->nr_to_write = nr_to_write;
>>>>>>      }
>>>>>>
>>>>>> +       { static int sum, cnt, max;
>>>>>> +       static unsigned long previous;
>>>>>> +       static int sum2, max2;
>>>>>> +
>>>>>> +       sum += clear_writes;
>>>>>> +       cnt += 1;
>>>>>> +
>>>>>> +       if (max < clear_writes) max = clear_writes;
>>>>>> +
>>>>>> +       sum2 += hidden_writes;
>>>>>> +       if (max2 < hidden_writes) max2 = hidden_writes;
>>>>>> +
>>>>>> +       if (cnt > 100 && time_after(jiffies, previous + 10*HZ)) {
>>>>>> +               printk("write_page_cache: sum=%d cnt=%d max=%d mean=%d sum2=%d max2=%d mean2=%d\n",
>>>>>> +                      sum, cnt, max, sum/cnt,
>>>>>> +                      sum2, max2, sum2/cnt);
>>>>>> +               sum = 0;
>>>>>> +               cnt = 0;
>>>>>> +               max = 0;
>>>>>> +               max2 = 0;
>>>>>> +               sum2 = 0;
>>>>>> +               previous = jiffies;
>>>>>> +       }
>>>>>> +       }
>>>>>>      return ret;
>>>>>>  }
>>>>>>  EXPORT_SYMBOL(write_cache_pages);
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------
>>>>>> From c8a4051c3731b6db224482218cfd535ab9393ff8 Mon Sep 17 00:00:00 2001
>>>>>> From: Eric Sandeen <sandeen@xxxxxxxxxxx>
>>>>>> Date: Fri, 31 Jul 2009 00:02:17 -0500
>>>>>> Subject: [PATCH] xfs: bump up nr_to_write in xfs_vm_writepage
>>>>>>
>>>>>> VM calculation for nr_to_write seems off.  Bump it way
>>>>>> up, this gets simple streaming writes zippy again.
>>>>>> To be reviewed again after Jens' writeback changes.
>>>>>>
>>>>>> Signed-off-by: Christoph Hellwig <hch@xxxxxxxxxxxxx>
>>>>>> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxxx>
>>>>>> Cc: Chris Mason <chris.mason@xxxxxxxxxx>
>>>>>> Reviewed-by: Felix Blyakher <felixb@xxxxxxx>
>>>>>> Signed-off-by: Felix Blyakher <felixb@xxxxxxx>
>>>>>> ---
>>>>>>  fs/xfs/linux-2.6/xfs_aops.c |    8 ++++++++
>>>>>>  1 files changed, 8 insertions(+), 0 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
>>>>>> index 7ec89fc..aecf251 100644
>>>>>> --- a/fs/xfs/linux-2.6/xfs_aops.c
>>>>>> +++ b/fs/xfs/linux-2.6/xfs_aops.c
>>>>>> @@ -1268,6 +1268,14 @@ xfs_vm_writepage(
>>>>>>      if (!page_has_buffers(page))
>>>>>>              create_empty_buffers(page, 1 << inode->i_blkbits, 0);
>>>>>>
>>>>>> +
>>>>>> +       /*
>>>>>> +        *  VM calculation for nr_to_write seems off.  Bump it way
>>>>>> +        *  up, this gets simple streaming writes zippy again.
>>>>>> +        *  To be reviewed again after Jens' writeback changes.
>>>>>> +        */
>>>>>> +       wbc->nr_to_write *= 4;
>>>>>> +
>>>>>>      /*
>>>>>>       * Convert delayed allocate, unwritten or unmapped space
>>>>>>       * to real space and flush out to disk.
>>>>>> --
>>>>>> 1.6.4.3
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>
>>>
>
>

Attachment: XFSvMD_2.pdf
Description: Adobe PDF document

