On Wed, Oct 30, 2013 at 02:40:06PM +0900, Minchan Kim wrote: >Hello, > >On Tue, Oct 29, 2013 at 09:14:03AM -0700, Laura Abbott wrote: >> On 10/29/2013 8:02 AM, Zhang Mingjun wrote: >> >> > It would move the cost to the CMA paths so I would complain less. Bear >> > in mind as well that forcing everything to go through free_one_page() >> > means that every free goes through the zone lock. I doubt you have any >> > machine large enough but it is possible for simultaneous CMA allocations >> > to now contend on the zone lock that would have been previously fine. >> > Hence, I'm interesting in knowing the underlying cause of the >> > problem you >> > are experiencing. >> > >> >my platform uses CMA but disabled CMA's migration func by del MIGRATE_CMA >> >in fallbacks[MIGRATE_MOVEABLE]. But I find CMA pages can still used by >> >pagecache or page fault page request from PCP list and cma allocation has to >> >migrate these page. So I want to free these cma pages to buddy directly >> >not PCP.. >> > >> > > of course, it will waste the memory outside of the alloc range >> > but in the >> > > pageblocks. >> > > >> > >> > I would hope/expect that the loss would only last for the duration of >> > the allocation attempt and a small amount of memory. >> > >> > > > when a range of pages have been isolated and migrated. Is there any >> > > > measurable benefit to this patch? >> > > > >> > > after applying this patch, the video player on my platform works more >> > > fluent, >> > >> > fluent almost always refers to ones command of a spoken language. I do >> > not see how a video player can be fluent in anything. What is measurably >> > better? >> > >> > For example, are allocations faster? If so, why? What cost from another >> > path is removed as a result of this patch? If the cost is in the PCP >> > flush then can it be checked if the PCP flush was unnecessary and called >> > unconditionally even though all the pages were freed already? We had >> > problems in the past where drain_all_pages() or similar were called >> > unnecessarily causing long sync stalls related to IPIs. I'm wondering if >> > we are seeing a similar problem here. >> > >> > Maybe the problem is the complete opposite. Are allocations failing >> > because there are PCP pages in the way? In that case, it real fix might >> > be to insert a if the allocation is failing due to per-cpu >> > pages. >> > >> >problem is not the allocation failing, but the unexpected cma migration >> >slows >> >down the allocation. >> > >> > >> > > and the driver of video decoder on my test platform using cma >> > alloc/free >> > > frequently. >> > > >> > >> > CMA allocations are almost never used outside of these contexts. While I >> > appreciate that embedded use is important I'm reluctant to see an impact >> > in fast paths unless there is a good reason for every other use case. I >> > also am a bit unhappy to see CMA allocations making the zone->lock >> > hotter than necessary even if no embedded use case it likely to >> > experience the problem in the short-term. >> > >> > -- >> > Mel Gorman >> > SUSE Labs >> > >> > >> >> We've had a similar patch in our tree for a year and a half because >> of CMA migration failures, not just for a speedup in allocation >> time. I understand that CMA is not the fast case or the general use >> case but the problem is that the cost of CMA failure is very high >> (complete failure of the feature using CMA). Putting CMA on the PCP >> lists means they may be picked up by users who temporarily make the >> movable pages unmovable (page cache etc.) which prevents the >> allocation from succeeding. The problem still exists even if the CMA >> pages are not on the PCP list but the window gets slightly smaller. > >I understand that I have seen many people want to use CMA have tweaked >their system to work well and although they do best effort, it doesn't >work well because CMA doesn't gaurantee to succeed in getting free >space since there are lots of hurdle. (get_user_pages, AIO ring buffer, >buffer cache, short of free memory for migration, no swap and so on). >Even, someone want to allocate CMA space with speedy. SIGH. > >Yeah, at the moment, CMA is really SUCK. > >> >> This really highlights one of the biggest issues with CMA today. >> Movable pages make return -EBUSY for any number of reasons. For >> non-CMA pages this is mostly fine, another movable page may be >> substituted for the movable page that is busy. CMA is a restricted >> range though so any failure in that range is very costly because CMA >> regions are generally sized exactly for the use cases at hand which >> means there is very little extra space for retries. >> >> To make CMA actually usable, we've had to go through and add in >> hacks/quirks that prevent CMA from being allocated in any path which >> may prevent migration. I've been mixed on if this is the right path >> or if the definition of MIGRATE_CMA needs to be changed to be more >> restrictive (can't prevent migration). > >Fundamental problem is that every subsystem could grab a page anytime >and they doesn't gaurantee to release it soonish or within time CMA >user want so it turns out non-determisitic mess which just hook into >core MM system here and there. > >Sometime, I see some people try to solve it case by case with ad-hoc >approach. I guess it would be never ending story as kernel evolves. > >I suggest that we could make new wheel with frontswap/cleancache stuff. >The idea is that pages in frontswap/cleancache are evicted from kernel >POV so that we can gaurantee that there is no chance to grab a page >in CMA area and we could remove lots of hook from core MM which just >complicated MM without benefit. > >As benefit, cleancache pages could drop easily so it would be fast >to get free space but frontswap cache pages should be move into somewhere. >If there are enough free pages, it could be migrated out there. Optionally >we could compress them. Otherwise, we could pageout them into backed device. >Yeah, it could be slow than migration but at least, we could estimate the time >by storage speed ideally so we could have tunable knob. If someone want >fast CMA, he could control it with ratio of cleancache:frontswap. >IOW, higher frontswap page ratio is, slower the speed would be. >Important thing is admin could have tuned control knob and it gaurantees to >get CMA free space with deterministic time. > >As drawback, if we fail to tune the ratio, memeory efficieny would be >bad so that it ends up thrashing but you guys is saying we have been >used CMA without movable fallback which means that it's already static >reserved memory and it's never CMA so you already have lost memory >efficiency and even fail to get a space so I think it's good trade-off >for embedded people. > >If anyone has interest the idea, I will move into that. >If it sounds crazy idea, feel free to ignore, please. > Interesting. ;-) Regards, Wanpeng Li >Thanks. > >> >> Thanks, >> Laura >> -- >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, >> hosted by The Linux Foundation >> >> -- >> To unsubscribe, send a message with 'unsubscribe linux-mm' in >> the body to majordomo@xxxxxxxxx. For more info on Linux MM, >> see: http://www.linux-mm.org/ . >> Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a> > >-- >Kind regards, >Minchan Kim > >-- >To unsubscribe, send a message with 'unsubscribe linux-mm' in >the body to majordomo@xxxxxxxxx. For more info on Linux MM, >see: http://www.linux-mm.org/ . >Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>