Re: [PATCH] mm: make fault_around_bytes configurable

Vinayak Menon <vinmenon@xxxxxxxxxxxxxx> · Fri, 22 Apr 2016 14:15:08 +0530

On 04/22/2016 05:31 AM, Andrew Morton wrote:
On Mon, 18 Apr 2016 20:47:16 +0530 Vinayak Menon <vinmenon@xxxxxxxxxxxxxx> wrote:

Mapping pages around fault is found to cause performance degradation
in certain use cases. The test performed here is launch of 10 apps
one by one, doing something with the app each time, and then repeating
the same sequence once more, on an ARM 64-bit Android device with 2GB
of RAM. The time taken to launch the apps is found to be better when
fault around feature is disabled by setting fault_around_bytes to page
size (4096 in this case).

Well that's one workload, and a somewhat strange one.  What is the
effect on other workloads (of which there are a lot!).

This workload emulates the way a user would use a mobile device:
opening an application, using it for some time, switching to the next,
and then coming back to the same application later. Another stat that
shows significant degradation on Android with fault_around is device
boot-up time. I have not tried any workloads other than these.

The tests were done on a 3.18 kernel. Four extra vmstat counters were
added for debugging. pgpgoutclean counts the clean pages reclaimed via
__delete_from_page_cache. pageref_activate, pageref_activate_vm_exec,
and pageref_keep count the mapped file pages activated and retained
by page_check_references.
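
For anyone wanting to collect similar numbers, here is a rough sketch of
snapshotting /proc/vmstat around a workload and computing the deltas. It
is not the tooling used for the tables in this mail. parse_vmstat,
read_vmstat, vmstat_delta and measure are made-up helper names, and the
four debugging counters only show up on a kernel carrying the extra
debugging patch.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not exactly "<name> <unsigned integer>".
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def read_vmstat(path="/proc/vmstat"):
    """Take one snapshot of the counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


def vmstat_delta(before, after):
    """Per-counter difference for counters present in both snapshots."""
    return {name: after[name] - before[name]
            for name in after if name in before}


def measure(workload, path="/proc/vmstat"):
    """Run workload() (any callable) and return the counter deltas."""
    before = read_vmstat(path)
    workload()
    return vmstat_delta(before, read_vmstat(path))
```

The counters are cumulative since boot, so only the difference between
two snapshots says anything about a particular test run.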

=== Without swap ===
                           3.18             3.18-fault_around_bytes=4096
-----------------------------------------------------------------------
workingset_refault        691100           664339
workingset_activate       210379           179139
pgpgin                    4676096          4492780
pgpgout                   163967           96711
pgpgoutclean              1090664          990659
pgalloc_dma               3463111          3328299
pgfree                    3502365          3363866
pgactivate                568134           238570
pgdeactivate              752260           392138
pageref_activate          315078           121705
pageref_activate_vm_exec  162940           55815
pageref_keep              141354           51011
pgmajfault                24863            23633
pgrefill_dma              1116370          544042
pgscan_kswapd_dma         1735186          1234622
pgsteal_kswapd_dma        1121769          1005725
pgscan_direct_dma         12966            1090
pgsteal_direct_dma        6209             967
slabs_scanned             1539849          977351
pageoutrun                1260             1333
allocstall                47               7

=== With swap ===
                           3.18             3.18-fault_around_bytes=4096
-----------------------------------------------------------------------
workingset_refault        597687           878109
workingset_activate       167169           254037
pgpgin                    4035424          5157348
pgpgout                   162151           85231
pgpgoutclean              928587           1225029
pswpin                    46033            17100
pswpout                   237952           127686
pgalloc_dma               3305034          3542614
pgfree                    3354989          3592132
pgactivate                626468           355275
pgdeactivate              990205           771902
pageref_activate          294780           157106
pageref_activate_vm_exec  141722           63469
pageref_keep              121931           63028
pgmajfault                67818            45643
pgrefill_dma              1324023          977192
pgscan_kswapd_dma         1825267          1720322
pgsteal_kswapd_dma        1181882          1365500
pgscan_direct_dma         41957            9622
pgsteal_direct_dma        25136            6759
slabs_scanned             689575           542705
pageoutrun                1234             1538
allocstall                110              26

Looks like with fault_around, there is more pressure on reclaim because
of the presence of more mapped pages, resulting in more IO activity,
more faults, more swapping, and allocstalls.

A few of those things did get a bit worse?

I think some numbers (workingset, pgpgin, pgpgoutclean, etc.) look
better with fault_around because the increased number of mapped pages
results in fewer file pages being reclaimed (see pageref_activate,
pageref_activate_vm_exec, and pageref_keep above), at the cost of
increased swapping. Latency is much worse with fault_around enabled
and swap, possibly because of the increased swapping, the decrease in
kswapd efficiency, and the increase in allocstalls.

So the problem appears to be that unwanted pages are mapped around the
fault and page_check_references is unaware of this.
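
To put a number on "kswapd efficiency": a small sketch, assuming
efficiency is taken as pages reclaimed per page scanned
(pgsteal_kswapd_dma / pgscan_kswapd_dma). That definition and the
function name are assumptions for illustration; the inputs are copied
from the tables above.

```python
def reclaim_efficiency(pgsteal, pgscan):
    """Fraction of scanned pages that were actually reclaimed."""
    return pgsteal / pgscan


# (pgsteal_kswapd_dma, pgscan_kswapd_dma) taken from the tables above.
KSWAPD = {
    "no swap, 3.18":                     (1121769, 1735186),
    "no swap, fault_around_bytes=4096":  (1005725, 1234622),
    "swap, 3.18":                        (1181882, 1825267),
    "swap, fault_around_bytes=4096":     (1365500, 1720322),
}

for label, (steal, scan) in KSWAPD.items():
    print(f"{label:36s} {reclaim_efficiency(steal, scan):.1%}")
```

In both configurations the ratio is lower with the default fault_around
setting than with fault_around_bytes=4096.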

Do you have any data on actual wall-time changes?  How much faster do
things become with the patch?  If it is "0.1%" then I'd say "umm, no".

=== Without swap ===
                          3.18         3.18-fault_around_bytes=4096
Avg launch latency        1695ms       1300ms (23.3%)
Max launch latency        5097ms       3135ms (38.49%)
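
The percentages are the reduction relative to the stock 3.18 numbers.
As a quick check of the arithmetic (the function name is just for
illustration):

```python
def latency_reduction(before_ms, after_ms):
    """Reduction in latency relative to the baseline, in percent."""
    return (before_ms - after_ms) / before_ms * 100.0


# Launch latencies from the table above: stock 3.18 vs.
# 3.18 with fault_around_bytes=4096.
avg = latency_reduction(1695, 1300)
worst = latency_reduction(5097, 3135)
print(f"avg: {avg:.1f}%  max: {worst:.2f}%")
```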

Make fault_around_bytes configurable so that it can be tuned to avoid
performance degradation.

It sounds like we need to be smarter about auto-tuning this thing.
Maybe the refault code could be taught to provide the feedback path but
that sounds hard.

Still.  I do think it would be better to make this configurable at
runtime.  Move the existing debugfs tunable into /proc/sys/vm (and
document it!).  I do dislike adding even more tunables but this one
does make sense.  People will want to run their workloads with various
values until they find the peak throughput, and requiring a kernel
rebuild for that is a huge pain.

I can send a v2 to do this runtime via /proc/sys/vm.
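
Until then, the existing debugfs tunable can be used for experiments.
A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug, the
kernel is built with CONFIG_DEBUG_FS, and the script runs as root; the
helper names are made up for illustration.

```python
# Assumed location of the existing tunable (debugfs mounted in the
# usual place). Adjust if debugfs is mounted elsewhere.
DEBUGFS_PATH = "/sys/kernel/debug/fault_around_bytes"


def get_fault_around_bytes(path=DEBUGFS_PATH):
    """Read the current fault-around window size in bytes."""
    with open(path) as f:
        return int(f.read().strip())


def set_fault_around_bytes(value, path=DEBUGFS_PATH):
    """Write a new window size; the page size (4096 here) disables
    mapping of extra pages around the fault."""
    with open(path, "w") as f:
        f.write(f"{value}\n")
```

The kernel may adjust the value that was written, so it is worth
reading it back with get_fault_around_bytes() before starting a run.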

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
