On Wed, Dec 11, 2013 at 4:02 AM, Bob Liu <lliubbo@xxxxxxxxx> wrote:
> Hi Dan & Seth,
>
> On Wed, Nov 27, 2013 at 9:28 AM, Dan Streetman <ddstreet@xxxxxxxx> wrote:
>> On Mon, Nov 25, 2013 at 1:00 PM, Seth Jennings <sjennings@xxxxxxxxxxxxxx> wrote:
>>> On Fri, Nov 22, 2013 at 11:29:16AM -0600, Seth Jennings wrote:
>>>> On Wed, Nov 20, 2013 at 02:49:33PM -0500, Dan Streetman wrote:
>>>> > Currently, zswap is a writeback cache; stored pages are not sent
>>>> > to swap disk, and when zswap wants to evict old pages it must
>>>> > first write them back to swap cache/disk manually. This avoids
>>>> > swap-out disk I/O up front, but only moves that disk I/O to
>>>> > the writeback case (for pages that are evicted), adds the
>>>> > overhead of having to uncompress the evicted pages, and adds the
>>>> > need for an additional free page (to store the uncompressed page)
>>>> > at a time of likely high memory pressure. Additionally, being
>>>> > writeback adds complexity to zswap by having to perform the
>>>> > writeback on page eviction.
>>>> >
>>>> > This changes zswap to a writethrough cache by enabling
>>>> > frontswap_writethrough() before registering, so that any
>>>> > successful page store will also be written to swap disk. All the
>>>> > writeback code is removed since it is no longer needed, and the
>>>> > only operation during a page eviction is now to remove the entry
>>>> > from the tree and free it.
>>>>
>>>> I like it. It gets rid of a lot of nasty writeback code in zswap.
>>>>
>>>> I'll have to test before I ack, hopefully by the end of the day.
>>>>
>>>> Yes, this will increase writes to the swap device over the delayed
>>>> writeback approach. I think it is a good thing though. I think it
>>>> makes the difference between zswap and zram, both in operation and in
>>>> application, more apparent. Zram is the better choice for embedded, where
>>>> write wear is a concern, and zswap is better if you need more
>>>> flexibility to dynamically manage the compressed pool.
>>>
>>> One thing I realized while doing my testing was that making zswap
>>> writethrough also impacts synchronous reclaim. Zswap, as it is now,
>>> makes the swapcache page clean during swap_writepage(), which allows
>>> shrink_page_list() to immediately reclaim it. Making zswap writethrough
>>> eliminates this advantage, and swapcache pages must be scanned again
>>> before they can be reclaimed, as is the case with normal swapping.
>>
>> Yep, I thought about that as well, and it is true, but only while
>> zswap is not full. With writeback, once zswap fills up, page stores
>> will frequently have to reclaim pages by writing compressed pages to
>> disk. With writethrough, the zbud reclaim should be quick, as it only
>> has to evict the pages, not write them to disk. So I think basically
>> writeback should speed up swap_writepage() (compared to the no-zswap case)
>> while zswap is not full, but (theoretically) slow it down (compared to
>> the no-zswap case) while zswap is full, while writethrough should slow
>> down swap_writepage() slightly (the time it takes to compress/store
>> the page) but consistently, by almost the same amount before it's full vs.
>> when it's full. Theoretically :-) Definitely something to think
>> about and test for.
>>
>
> Have you gotten any further benchmark results?

Yes, and sorry for the delay. The initial numbers I got on a relatively
low-end (laptop) system seem to indicate that writethrough does reduce
performance in the beginning, when zswap isn't full, but also improves
performance once zswap fills up.
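For reference, the core of the writethrough change quoted above is small;
roughly this (a sketch only, not the full patch: the zbud pool/compressor
setup and error handling of the real init path are elided, and
zswap_frontswap_ops stands in for the existing zswap ops struct in
mm/zswap.c):

#include <linux/init.h>
#include <linux/module.h>
#include <linux/frontswap.h>

/*
 * Sketch: ask frontswap for writethrough behavior before registering
 * zswap's ops, so every page zswap successfully stores is also sent on
 * to the swap device.  Everything else in the init path is unchanged.
 */
static int __init zswap_init(void)
{
	/* ... zbud pool and compressor setup as in the current code ... */

	frontswap_writethrough(true);	/* store to zswap AND write to disk */
	frontswap_register_ops(&zswap_frontswap_ops);
	return 0;
}
module_init(zswap_init);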
I'm working on getting a higher-end server-class system set up to test as
well, and getting a larger sample size of test runs (but specjbb takes
quite a long time per run).

At this point, based on those results and Weijie's suggestion to make it
configurable, I'm thinking it probably is better to keep the writeback
code and allow selection of writeback or writethrough. That would allow
*possibly* switching from writeback to writethrough based on how full
zswap is; at the least it would let users select which to use. I still
think keeping both makes zswap more complex, but moving completely to
writethrough may not be best for all situations. I suspect that systems
with relatively slow disk I/O and fast processors would benefit from
writeback, while systems with relatively fast disk I/O (e.g. SSD swap)
and slower processors would benefit from writethrough.

So, any opinions on keeping both writeback and writethrough, with (for
now) a module param to select? (A rough sketch of what I have in mind is
below, after the numbers.) I'll send an updated patch if that sounds
agreeable to all...

The specific specjbb numbers (on a dual-core 2.4GHz system with 4G of RAM)
I got were:

writeback
heap (MB)   bops
3000        38887
3500        39260
4000        38113
4500        15686
5000        10978
5500         1445
6000         1827

writethrough
heap (MB)   bops
3000        39021
3500        35998
4000        36223
4500         7222
5000         7717
5500         2304
6000         2455
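And the module-param sketch, to be concrete about what I mean (again,
just a sketch rather than the actual updated patch; the "writethrough"
param name and read-only permissions are placeholders until we agree on
them):

#include <linux/module.h>
#include <linux/frontswap.h>

/* Sketch: keep writeback as the default, allow writethrough via a param. */
static bool zswap_writethrough;
module_param_named(writethrough, zswap_writethrough, bool, 0444);

/* ... then in the init path, before registering the frontswap ops ... */
	if (zswap_writethrough)
		frontswap_writethrough(true);
	frontswap_register_ops(&zswap_frontswap_ops);

With zswap built in, that would be selected at boot with
zswap.writethrough=Y (assuming that param name), and the existing
writeback behavior would remain the default.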