On Mon, Aug 17, 2020 at 10:28:49AM +0200, Michal Hocko wrote:
> On Mon 17-08-20 00:56:55, Uladzislau Rezki wrote:
> [...]
> > Michal asked to provide some data regarding how many pages we need and how
> > "lockless allocation" behaves when it comes to success vs failure scenarios.
> >
> > Please see below some results. The test case is a tight loop of 1 000 000 allocations
> > doing kmalloc() and kfree_rcu():
>
> It would be nice to cover some more realistic workloads as well.
>
Hmm.. I tried to show a synthetic worst case, when a "flood" occurs. In
such conditions we can get failures, which is expected, and we have a
fallback mechanism for that.

> > sudo ./test_vmalloc.sh run_test_mask=2048 single_cpu_test=1
> >
> > <snip>
> >     for (i = 0; i < 1000000; i++) {
> >         p = kmalloc(sizeof(*p), GFP_KERNEL);
> >         if (!p)
> >             return -1;
> >
> >         p->array[0] = 'a';
> >         kvfree_rcu(p, rcu);
> >     }
> > <snip>
> >
> > wget ftp://vps418301.ovh.net/incoming/1000000_kmalloc_kfree_rcu_proc_percpu_pagelist_fractio_is_0.png
>
> If I understand this correctly then this means that failures happen very
> often because pcp pages are not recycled quickly enough.
>
Yep, it happens, and that is the worst-case (flood) scenario. Therefore
we have a fallback, and such failures are expected. Also, I did not
provide the number of pages used in the loop. On my test machine we need
approximately 300-400 pages to cover that flood case until we recycle
the pages or return them to the pcp.

Please note, as I mentioned before, our drain part is not optimal for
sure, meaning we can rework it a bit to make it more efficient. For
example, when a flood occurs, instead of delaying the "reclaimer logic"
thread, it can be placed on a run-queue right away. We could use a
separate "flush workqueue" tagged with WQ_MEM_RECLAIM, raising the
priority of the drain context. That is, there is room for reducing the
page footprint (a rough sketch is at the end of this message).

> > wget ftp://vps418301.ovh.net/incoming/1000000_kmalloc_kfree_rcu_proc_percpu_pagelist_fractio_is_8.png
>
> 1/8 of the memory in pcp lists is quite large and likely not something
> used very often.
>
Just for illustration: when percpu_pagelist_fraction is set to 8, I do
not see any page allocation failures in the single-CPU flood case. If I
run such a flood on all available CPUs simultaneously, there will be
failures for sure.

> Both these numbers just make me think that a dedicated pool of pages
> pre-allocated for RCU specifically might be a better solution. I still
> haven't read through that branch of the email thread though so there
> might be some pretty convincing arguments to not do that.
>
> > Also I would like to underline that the kfree_rcu() reclaim logic can be improved further,
> > making the drain logic more efficient when it comes to time, thus reducing the footprint
> > and, as a result, the number of required pages.
> >
> > --
> > Vlad Rezki
>
> --
> Michal Hocko
> SUSE Labs
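
To make the "flush workqueue" idea above a bit more concrete, here is a
rough sketch. The queue name, the drain callback and the kick_drain()
helper are illustrative only, not the actual kfree_rcu() drain code:

<snip>
#include <linux/workqueue.h>

static struct workqueue_struct *kfree_rcu_flush_wq;

static void drain_page_cache_work(struct work_struct *work)
{
	/* Return the cached pages back to the page allocator here. */
}

static DECLARE_WORK(drain_work, drain_page_cache_work);

static int __init kfree_rcu_flush_wq_init(void)
{
	/*
	 * WQ_MEM_RECLAIM guarantees a rescuer thread, so the drain work
	 * can make forward progress even under memory pressure.
	 */
	kfree_rcu_flush_wq = alloc_workqueue("kfree_rcu_flush",
					     WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
	return kfree_rcu_flush_wq ? 0 : -ENOMEM;
}

/* On flood detection: queue the drain right away instead of delaying it. */
static void kick_drain(void)
{
	queue_work(kfree_rcu_flush_wq, &drain_work);
}
<snip>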