Re: [LSF/MM TOPIC] Writeback - current state and future

On 02/06/2011 05:13 PM, Sorin Faibish wrote:
> I was thinking of having a special track for all the writeback-related
> topics. I would also like to include a discussion on new cache writeback
> patterns, with the goal of preventing the cache swaps that are becoming a
> bigger problem when dealing with servers with hundreds of GB of cache.
> Swapping is the worst thing that can happen to the performance of such
> systems. I will share my latest findings on cache writeback, in
> continuation of my previous discussion at the last LSF.
> 
> /Sorin
> 

Yes, you should try out Wu Fengguang's latest patches; they fix a lot of
what you described at the last LSF, with a philosophy similar to what we
talked about. It would definitely be interesting to see results.

Thanks
Boaz

> On Sun, 06 Feb 2011 05:43:20 -0500, Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> 
>> On 02/04/2011 06:42 PM, Jan Kara wrote:
>>>   Hi,
>>>
>>>   I'd like to have one session about writeback. The content would highly
>>> depend on the current state of things, but on a general level I'd like to
>>> quickly sum up what went into the kernel (or is mostly ready to go) since
>>> the last LSF (handling of background writeback, livelock avoidance), what
>>> is being worked on - IO-less balance_dirty_pages() (if it doesn't end up
>>> in the mostly-done section) - and what else needs improving (kswapd
>>> writeout and the writeback_inodes_sb_if_idle() mess come to mind).
>>>
>>> 								Honza
>>
>> Ha, I most certainly want to participate in this talk. I wanted to
>> suggest it myself.
>>
>> Topics I would like to raise on the matter:
>>
>> [IO-less balance_dirty_pages]
>> As said, I'd really like it if Wu or Jan could explain more about the math
>> and IO patterns that went into this tremendous work, and how it should
>> affect us fs maintainers in terms of advantages and disadvantages. If
>> digging too deeply into this is not interesting for everybody, perhaps
>> a side meeting with fewer people is also possible.
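>>
>> As I understand the general principle, it is something like the toy
>> sketch below (all names hypothetical; this is certainly not Wu's actual
>> code): the dirtying task no longer submits writeback itself, it just
>> sleeps long enough that its dirtying rate matches what the flusher
>> thread can clean.
>>
>>     #include <linux/delay.h>
>>
>>     /* Toy model only. ratelimit_pps: pages/sec this task is allowed
>>      * to dirty, derived from the measured device writeout bandwidth. */
>>     static void toy_balance_dirty_pages(unsigned long pages_dirtied,
>>                                         unsigned long ratelimit_pps)
>>     {
>>             /* Sleep long enough that pages_dirtied over the elapsed
>>              * time does not exceed ratelimit_pps. */
>>             unsigned long pause_ms = pages_dirtied * 1000 / ratelimit_pps;
>>
>>             if (pause_ms)
>>                     msleep(pause_ms);   /* throttle by sleeping, no IO */
>>     }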
>>
>> [Aligned write-back]
>> I have just finished raid5/6 support in my filesystem and will be sending
>> a patch that tries very aggressively to align IO on stripe boundaries.
>> I did not take the btrfs approach of cutting and pasting the
>> write_cache_pages() function to better fit the bill; instead I used
>> wbc->nr_to_write to trim IO down to stripe alignment. Together with some
>> internal structure games, I now have a much better situation than with
>> the untouched code. By better I mean that with a simple linear dd to a
>> file, I see ~90% aligned IOs, as opposed to 20% before the patch. The
>> one remaining issue (which I have not fully investigated yet) is this:
>> because I do not want any residue left over from outside the
>> writepages() call, I do not sync and lock with the flusher, and instead
>> keep a "flushing" flag in my writeout path. What I still see is that the
>> writeback sometimes catches up with dd and I get a short write for the
>> remainder, which makes the end of one call and the start of the next
>> unaligned.
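>>
>> To make the trimming concrete, here is a minimal sketch of the idea
>> (hypothetical names, not the actual patch):
>>
>>     #include <linux/writeback.h>
>>     #include <linux/pagemap.h>
>>
>>     /* Trim wbc->nr_to_write so the chunk ends on a stripe boundary;
>>      * the residue is picked up by the next writepages() call. */
>>     static void trim_to_stripe(struct writeback_control *wbc,
>>                                pgoff_t start_index, long pages_per_stripe)
>>     {
>>             long to_boundary = pages_per_stripe -
>>                                (start_index % pages_per_stripe);
>>
>>             if (wbc->nr_to_write > to_boundary) {
>>                     /* Keep whole stripes past the first boundary. */
>>                     long whole = (wbc->nr_to_write - to_boundary) /
>>                                  pages_per_stripe * pages_per_stripe;
>>                     wbc->nr_to_write = to_boundary + whole;
>>             }
>>             /* else: a short chunk; this is the unaligned-residue case
>>              * described above. */
>>     }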
>>
>> I envision a simple BDI member, much like ra_pages for readahead, that
>> better governs writeback chunking (and is accounted for in the fairness
>> logic).
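>>
>> A hypothetical sketch of what I mean (wb_chunk_pages does not exist
>> upstream; ra_pages does):
>>
>>     struct toy_bdi {
>>             unsigned long ra_pages;       /* exists: readahead window */
>>             unsigned long wb_chunk_pages; /* proposed: preferred writeback
>>                                            * chunk, e.g. one RAID stripe */
>>     };
>>
>>     /* The flusher would round its per-inode chunk down to the BDI's
>>      * preferred granularity instead of using a global constant. */
>>     static long round_wb_chunk(const struct toy_bdi *bdi, long nr_to_write)
>>     {
>>             if (bdi->wb_chunk_pages && nr_to_write > bdi->wb_chunk_pages)
>>                     nr_to_write -= nr_to_write % bdi->wb_chunk_pages;
>>             return nr_to_write;
>>     }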
>>
>> [Smarter/more cache eviction patterns]
>> I love it when I do a simple dd test in a UML (300MB of RAM) and halfway
>> through I get these fat WARN_ONs from the iSCSI TCP transport failing to
>> allocate network buffers during writeback. And I did lower the writeback
>> ratio a lot, because the default of 20% has not worked for a long time,
>> since around 2.6.35 or 2.6.36. The UML is not the only affected system;
>> any low-memory, embedded-like, but 64-bit system would be too. The IO
>> does complete eventually, but performance drops to 20%.
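>>
>> For reference, the knobs I mean are the dirty-ratio sysctls; the values
>> below are illustrative only, not a recommendation:
>>
>>     # /etc/sysctl.conf -- illustrative values
>>     vm.dirty_ratio = 5
>>     vm.dirty_background_ratio = 2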
>>
>> Now, for a dd- or cp-like work pattern I would like the pages to be
>> freed much more aggressively, right after IO completion, because I most
>> certainly will not use them again. On the other hand, git for example
>> will write a big sequential file and then immediately turn around and
>> read it, so cache presence is a win there. But I think we can still come
>> up with good heuristics that take into account the number of file
>> handles opened on an inode, plus some hot-inode history, to arrive at
>> better patterns. (Some of this history we already have via the security
>> plugins.)
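>>
>> A userspace illustration of the behavior I would like the kernel to
>> infer automatically for streaming writers (posix_fadvise() is the real
>> API; the helper name is made up):
>>
>>     #include <fcntl.h>
>>     #include <unistd.h>
>>
>>     /* After a streaming write: flush the pages, then drop them from
>>      * the page cache, since we will not read them back. */
>>     static int drop_stream_cache(int fd)
>>     {
>>             if (fsync(fd))              /* make the pages clean first */
>>                     return -1;
>>             return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
>>     }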
>>
>> And there are other topics that I had but cannot remember right now.
>>
>> Thanks
>> Boaz
> 
> 
> 


