On Mon, Jul 21, 2014 at 04:36:51PM +0900, Minchan Kim wrote:
> On Mon, Jul 21, 2014 at 03:16:10PM +0900, Gioh Kim wrote:
> >
> >
> > On 2014-07-21 at 11:50 AM, Minchan Kim wrote:
> > >Hi Gioh,
> > >
> > >On Fri, Jul 18, 2014 at 03:45:36PM +0900, Gioh Kim wrote:
> > >>
> > >>Hi,
> > >>
> > >>For page migration of CMA, buffer-heads of lru should be dropped.
> > >>Please refer to https://lkml.org/lkml/2014/7/4/101 for the history.
> > >
> > >Just nit:
> > >Please write the *problem* in the description instead of a URL link.
> > >
> > >>
> > >>I have two solutions to drop bhs.
> > >>One is invalidating the entire lru.
> > >
> > >You mean? All of the percpu bh_lrus, so if the system has N cpus,
> > >it invalidates N * 8?
> >
> > Yes, every bh_lru of all cpus.
> >
> > >
> > >>Another is searching the lru and dropping only one bh, which Laura proposed
> > >>at https://lkml.org/lkml/2012/8/31/313.
> > >>
> > >>I'm not sure which has better performance.
> > >
> > >For whom? The system or the requestor of CMA?
> >
> > For system performance.
> >
> > >
> > >>So I did a performance test on my cortex-a7 platform with Lmbench,
> > >>which has a "File & VM system latencies" test.
> > >>I am attaching the results.
> > >>The first line is for invalidating the entire lru and the second is for dropping the selected bh.
> > >
> > >You mean you ran Lmbench with background CMA allocation?
> > >Could you describe it in detail?
> >
> > I'm sorry I did not mention the background.
> > I did the test without CMA allocation because I wanted to check how it affects system performance.
> >
> > The first test, invalidating the entire lru, adds invalidate_bh_lrus() to alloc_contig_range().
> > This does not affect system performance because alloc_contig_range() is not called
> > for usual file-system management.
> > The result of the first test is the *default system performance.*
> >
> > The second test, dropping all bh in lru, adds code to drop_buffers().
> > Every call of drop_buffers drops all bhs in the lru of every cpu.
> > It can affect system performance.
> > *But* it does not affect system performance,
> > because it drops only the bh of migrated pages.
> >
> >
> > >
> > >>
> > >>File & VM system latencies in microseconds - smaller is better
> > >>-------------------------------------------------------------------------------
> > >>Host                 OS 0K File       10K File      Mmap    Prot  Page    100fd
> > >>                        Create Delete Create Delete Latency Fault Fault   selct
> > >>--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
> > >>10.178.33 Linux 3.10.19   25.1   19.6   32.6   19.7  5098.0 0.666 3.45880 6.506
> > >>10.178.33 Linux 3.10.19   24.9   19.5   32.3   19.4  5059.0 0.563 3.46380 6.521
> > >>
> > >>
> > >>I tried several times but the result tells that they are the same under 1% gap
> > >>except Protection Fault.
> > >>But the latency of Protection Fault is very small and I think it has little effect.
> > >>
> > >>Therefore we can choose either, but I choose invalidating the entire lru.
> > >
> > >Not sure we can conclude like that.
> > >
> > >A few weeks ago, I saw a patch which increases the bh_lrus size.
> > >https://lkml.org/lkml/2014/7/4/107
> > >IOW, some workloads are really affected by the percpu bh_lrus, so it would be
> > >better to be careful about draining them, I think.
> > >
> > >You want to argue CMA allocation is rare so the cost is marginal.
> > >It might be, but some use cases might call it frequently with small requests
> > >(i.e., 8K, 16K).
> > >
> > >Anyway, why can't CMA bear the cost without affecting other subsystems?
> > >I mean it's okay for CMA to consume more time to shoot out the bh
> > >instead of a simple invalidation of all bh_lrus, because big order allocation is
> > >kind of a slow thing in the VM, everybody already knows that, and it even
> > >sometimes fails, so it's okay to add more code to that extremely slow path.
> >
> > There are 2 reasons to invalidate the entire bh_lru.
> >
> > 1. I think CMA allocation is very rare, so invalidating bh_lru affects the system little.
> > What do you think about it?
> > My platform does not call CMA allocation often.
> > Is CMA allocation or Memory-Hotplug called often?
>
> It depends on the use case and you can't assume anything, because we can't
> tell everyone in the world "Please ask us whenever you try to use CMA".
>
> The key point is how maintainable the patch is.
> If it's too complicated to maintain, maybe we could go with the simple solution,
> but if it's not too complicated, we can go with the smarter thing to consider
> other cases in the future. Why not?
>
> Another point is how a user can detect where a regression comes from.
> If we cannot notice the regression, it's not a good idea to go with the simple
> version.
>
> >
> > 2. Adding code in drop_buffers() can affect the system more than adding code in alloc_contig_range(),
> > because drop_buffers() does not have a way to distinguish the migrate type,
> > even though the Lmbench results show almost the same performance.
> > But I am afraid that it can change.
> > As you said, if the bh_lru size can be changed, it affects more than now.
> > So I do not want to touch non-CMA related code.
>
> I'm not saying to add a hook in drop_buffers.
> What I suggest is to handle the failure caused by bh_lrus in migrate_pages,
> because it's not a problem only in CMA.
> There is already retry logic in migrate_pages so I think you could
> handle it there.
>
> >
> >
> > >
> > >
> > >>try_to_free_buffers(), which calls drop_buffers(), is called by a lot of filesystem code.
> > >>So I think inserting code in drop_buffers() can affect the system.
> > >>And also we cannot distinguish the migration type in drop_buffers().
> > >>
> > >>In alloc_contig_range() we can distinguish the migration type and invalidate the lru if needed.
> > >>I think alloc_contig_range() is the proper place to deal with bh, as in the following patch.
> > >>
> > >>Laura, can I have your name on the Acked-by line?
> > >>Please let me express my thanks.
> > >>
> > >>Thanks for any feedback.
> > >>
> > >>------------------------------- 8< ----------------------------------
> > >>
> > >>>From 33c894b1bab9bc26486716f0c62c452d3a04d35d Mon Sep 17 00:00:00 2001
> > >>From: Gioh Kim <gioh.kim@xxxxxxx>
> > >>Date: Fri, 18 Jul 2014 13:40:01 +0900
> > >>Subject: [PATCH] CMA/HOTPLUG: clear buffer-head lru before page migration
> > >>
> > >>The bh must be free to migrate a page at which bh is mapped.
> > >>The reference count of bh is increased when it is installed
> > >>into lru so that the bh of lru must be freed before migrating the page.
> > >>
> > >>This frees every bh of lru. We could free only bh of migrating page.
> > >>But searching lru costs more than invalidating entire lru.
> > >>
> > >>Signed-off-by: Gioh Kim <gioh.kim@xxxxxxx>
> > >>Acked-by: Laura Abbott <lauraa@xxxxxxxxxxxxxx>
> > >>---
> > >> mm/page_alloc.c |    3 +++
> > >> 1 file changed, 3 insertions(+)
> > >>
> > >>diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > >>index b99643d4..3b474e0 100644
> > >>--- a/mm/page_alloc.c
> > >>+++ b/mm/page_alloc.c
> > >>@@ -6369,6 +6369,9 @@ int alloc_contig_range(unsigned long start, unsigned long end,
> > >> 	if (ret)
> > >> 		return ret;
> > >>
> > >>+	if (migratetype == MIGRATE_CMA || migratetype == MIGRATE_MOVABLE)
> > >>+		invalidate_bh_lrus();
> > >>+
> > >
> > >Q1. Is it only a CMA problem? Memory-Hotplug is not a problem? Or other places?
> > >
> > >I mean it would be better to handle it in a generic way.
> >
> > Only CMA and Memory-Hotplug need it.
>
> Memory-hotplug uses alloc_contig_range?
> You are adding the logic in alloc_contig_range and it is used for
> hugetlb and cma.
>
> > And I think invalidate_bh_lrus() is general.
>
> It couldn't handle memory-hotplug.
>
> >
> > >
> > >Q2. Why do you call it right before calling __alloc_contig_migrate_range?
> > >
> > >Some pages will go into bh_lrus during __alloc_contig_migrate_range.
> > >In that case, it is useless without the caller's retry logic.
> > >Even if you do it from the caller's retry logic, it's not a good idea because
> > >you create a new binding between alloc_contig_range and the upper layer.
> > >
> > >So, IMHO, it would be better to handle it in migrate_pages.
> > >Maybe we could define a new API, try_to_drop_buffers, which calls
> > >try_to_free_buffers and then, only if that function fails due to the
> > >percpu lru count, drains only that bh from the percpu lru lists instead of
> > >draining all bhs. And places in the migration path should use it rather than
> > >try_to_release_page.
> > >
> > >But the problem is that this approach invents a new API which has to be
> > >maintained, so I'm not sure Andrew thinks it's worth it.
> > >Maybe we should see the code and diffstat.
> >
> > I also considered making a new function, drop_bh_of_migrate_page, in migrate_pages(), just before unmap_and_move().
> > migrate_pages() has an argument, reason, that distinguishes the migrate type: MR_CMA, MR_MEMORY_HOTPLUG or others.
>
> Yes, that's what I suggested. If you see -EAGAIN, maybe you could do it.
> We could even enhance it by invalidating only the target bh instead of
> all bhs, so you could make two patches.
>
> 1. use invalidate_bh_lrus in migrate_pages
> 2. invalidate only the failed bh instead of flushing the percpu bh_lrus of all CPUs.

Otherwise, 2-1. create try_to_drop_buffers and use it in the migration path instead of try_to_release_page.

>
> So, if people hate 2, which is rather overdesigned, we could drop 2, but 1 is
> still mergeable.
>
> >
> > But I DO NOT WANT TO touch non-CMA related code.
> > The current CMA and Memory-Hotplug code is not mature, so I am not sure it is ok to touch non-CMA related code for CMA/Memory-Hotplug.
> >
> > My point is:
> > 1. CMA/Memory-hotplug is rare and invalidating bh-lru is also rare.
> > 2. Only change CMA/Memory-hotplug related code.
> >
> > >
> > >Overengineering?
> > >
> > >> 	ret = __alloc_contig_migrate_range(&cc, start, end);
> > >> 	if (ret)
> > >> 		goto done;
> > >>--
> > >>1.7.9.5
> > >>
> > >>--
> > >>To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > >>the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
> > >>see: http://www.linux-mm.org/ .
> > >>Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>
> > >
> >
>
> --
> Kind regards,
> Minchan Kim

--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
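
To make the two options in this thread concrete, here is a rough userspace
model, not kernel code and not taken from any posted patch. It assumes a fixed
number of CPUs, an 8-slot LRU per CPU, and a simplified buffer that carries
only a page id and a reference count. Every model_* name is hypothetical;
model_try_to_drop_buffers only imitates the idea of the try_to_drop_buffers
helper proposed above: try to free first, and only on -EAGAIN drop the LRU
entries that pin this page's buffer before retrying.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NR_CPUS  4  /* assumption: arbitrary CPU count for the model */
#define MODEL_LRU_SIZE 8  /* per-CPU slots, matching the "N * 8" above */

/* Hypothetical stand-in for a buffer head: only what the model needs. */
struct model_bh {
	int page_id;  /* page this buffer belongs to */
	int refcount; /* each LRU slot holding the bh takes one reference */
};

static struct model_bh *model_lrus[MODEL_NR_CPUS][MODEL_LRU_SIZE];

/* Empty every slot without touching refcounts (setup only). */
static void model_reset(void)
{
	memset(model_lrus, 0, sizeof(model_lrus));
}

/* Cache bh in one CPU's LRU and take a reference; -ENOSPC if it is full. */
static int model_lru_install(int cpu, struct model_bh *bh)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	for (int i = 0; i < MODEL_LRU_SIZE; i++) {
		if (!model_lrus[cpu][i]) {
			model_lrus[cpu][i] = bh;
			bh->refcount++;
			return 0;
		}
	}
	return -ENOSPC;
}

/*
 * Drop LRU references. page_id < 0 empties every slot on every CPU
 * (the "invalidate entire lru" option); otherwise only slots caching
 * buffers of that page are dropped. Returns the number of slots cleared.
 */
static int model_lru_drop(int page_id)
{
	int dropped = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		for (int i = 0; i < MODEL_LRU_SIZE; i++) {
			struct model_bh *bh = model_lrus[cpu][i];

			if (!bh || (page_id >= 0 && bh->page_id != page_id))
				continue;
			bh->refcount--;
			model_lrus[cpu][i] = NULL;
			dropped++;
		}
	}
	return dropped;
}

/* A buffer can be freed for migration only when nobody references it. */
static int model_try_to_free(const struct model_bh *bh)
{
	return bh->refcount == 0 ? 0 : -EAGAIN;
}

/* Try to free; only on -EAGAIN pay for a targeted drop, then retry once. */
static int model_try_to_drop_buffers(struct model_bh *bh)
{
	int ret = model_try_to_free(bh);

	if (ret == -EAGAIN) {
		model_lru_drop(bh->page_id);
		ret = model_try_to_free(bh);
	}
	return ret;
}
```

In this model the targeted path leaves buffers of unrelated pages cached,
while model_lru_drop(-1) clears them as well. The model says nothing about
the real cost of walking the per-CPU lists in the kernel, which is the open
question in the discussion above.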