On Aug 3, 2017 01:07, "Andrew Morton" <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
On Wed, 2 Aug 2017 12:25:05 +0200 Vitaly Wool <vitalywool@xxxxxxxxx> wrote:
> z3fold is operating on unbuddied lists in a simple manner: in fact,
> it only takes the first entry off the list on a hot path. So if the
> z3fold pool is big enough and balanced well enough, considering
> only the lists local to the current CPU won't be an issue in any
> way, while random I/O performance will go up.
Has the performance benefit been measured? It's a large patch.
Yes, mostly by running fio in randrw mode. We see performance more than double on an 8-core ARM64 system.
> This patch also introduces two worker threads which: one for async
> in-page object layout optimization and one for releasing freed
> pages.
Why? What are the runtime effects of this change? Does this turn
currently-synchronous operations into now-async operations? If so,
what are the implications of this if, say, the workqueue doesn't get
serviced for a while?
The biggest benefit is that it usually ends up with one call to compact_page instead of two. Also, we use z3fold as a zram backend, and zram likes to free pages on a critical path, so removing compaction from that critical path is definitely a nice thing.
If the compaction workqueue doesn't get serviced for a significant while, the compression ratio will go down a bit, but nothing bad will happen. And z3fold_alloc tries to take new pages from the stale list first, so even if the release workqueue is not run, the pages will be reused by z3fold_alloc.
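To illustrate the stale-list point, here is a minimal userspace sketch of the idea, not the z3fold code from the patch. All names in it (page_stub, pool_stub, pool_alloc, pool_free, pool_release_stale) are hypothetical, and it ignores locking and per-CPU handling entirely. It only shows that a free queues the page instead of releasing it, and that an allocation takes from that queue first, so a delayed release worker costs nothing beyond some memory being held a little longer.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a z3fold page; not the kernel structure. */
struct page_stub {
    struct page_stub *next;
};

/* Hypothetical pool; the counters exist only to make the behaviour visible. */
struct pool_stub {
    struct page_stub *stale;   /* freed pages awaiting deferred release */
    int fresh_allocs;          /* pages obtained from the system allocator */
    int reused;                /* pages recycled from the stale list */
};

/* Free path: only queue the page; the actual release is deferred. */
static void pool_free(struct pool_stub *p, struct page_stub *pg)
{
    pg->next = p->stale;
    p->stale = pg;
}

/* Alloc path: try the stale list first, then fall back to a fresh page. */
static struct page_stub *pool_alloc(struct pool_stub *p)
{
    struct page_stub *pg = p->stale;

    if (pg) {
        p->stale = pg->next;
        pg->next = NULL;
        p->reused++;
        return pg;
    }

    pg = malloc(sizeof(*pg));
    if (pg) {
        pg->next = NULL;
        p->fresh_allocs++;
    }
    return pg;
}

/* Deferred release: what the worker would do whenever it gets to run. */
static int pool_release_stale(struct pool_stub *p)
{
    int released = 0;

    while (p->stale) {
        struct page_stub *pg = p->stale;

        p->stale = pg->next;
        free(pg);
        released++;
    }
    return released;
}
```

In this sketch, a page freed and then requested again before pool_release_stale runs is handed back as-is, which is the reuse behaviour described above.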
etc. Sorry, but I'm not seeing anywhere near enough information and
testing results to justify merging such a large and intrusive patch.
Thanks,
Vitaly