On Fri 08-08-14 09:26:35, Johannes Weiner wrote:
> On Fri, Aug 08, 2014 at 02:32:58PM +0200, Michal Hocko wrote:
> > On Thu 07-08-14 11:31:41, Johannes Weiner wrote:
[...]
> > > although system time is reduced with the high limit.
> > > High limit reclaim with SWAP_CLUSTER_MAX has better fault latency
> > > but it doesn't actually contain the workload - with 1G high and a
> > > 4G load, the consumption at the end of the run is 3.7G.
> >
> > Wouldn't it help to simply fail the charge and allow the charger to
> > fallback for THP allocations if the usage is above high limit too
> > much? The follow up single page charge fallback would be still
> > throttled.
>
> This is about defining the limit semantics in unified hierarchy, and
> not really the time or place to optimize THP charge latency.
>
> What are you trying to accomplish here?

Well, there are two things. The first is that this patch changes how
THP is charged against the hard limit without any data in the
changelog to back that up. This is the primary concern.

The other is the high limit behavior for a large excess. You have
chosen to reclaim the entire excess even when that means direct
reclaiming quite a lot of pages. This is potentially dangerous because
the excess might be really huge (consider multiple tasks charging THPs
simultaneously on many CPUs). Do you really want to direct reclaim
nr_online_cpus * 512 pages in a single direct reclaim pass, and on
each of those CPUs? With 64 CPUs that would be 32768 pages, i.e. 128MB
with 4KB pages, per pass. This is an extreme case, all right, but the
point stands. There has to be some cap.

Also, THP seems to be the primary source of trouble here, so the
question is: do we really want to push hard to reclaim enough charges,
or would we rather fail the THP charge and go with a single page
retry?
-- 
Michal Hocko
SUSE Labs
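
To make the suggested fallback concrete, here is a minimal standalone
C sketch of the semantics in question. Everything in it is
illustrative: struct memcg, try_charge() and reclaim_high() are toy
stand-ins for the real mm/memcontrol.c code, and the one-pass
SWAP_CLUSTER_MAX cap is just one possible policy, not the patch under
discussion:

#include <stdbool.h>
#include <stdio.h>

#define SWAP_CLUSTER_MAX  32UL   /* pages reclaimed per pass */
#define HPAGE_PMD_NR      512UL  /* pages in one 2MB THP (x86-64) */

/* Toy model of a memcg: usage and high limit, in pages. */
struct memcg {
	unsigned long usage;
	unsigned long high;
};

/* Stand-in for direct reclaim: uncharge up to nr pages. */
static void reclaim_high(struct memcg *m, unsigned long nr)
{
	m->usage -= nr < m->usage ? nr : m->usage;
}

static bool try_charge(struct memcg *m, unsigned long nr_pages)
{
	/*
	 * A THP charge while already above the high limit fails
	 * outright instead of reclaiming the (possibly huge) excess;
	 * the fault path then falls back to a single page.
	 */
	if (nr_pages == HPAGE_PMD_NR && m->usage > m->high)
		return false;

	m->usage += nr_pages;

	/*
	 * Throttle the charger, but cap the work done in one pass so
	 * that nr_online_cpus concurrent chargers cannot each direct
	 * reclaim 512 pages at once.
	 */
	if (m->usage > m->high)
		reclaim_high(m, SWAP_CLUSTER_MAX);
	return true;
}

int main(void)
{
	/* 1G high limit (262144 pages at 4KB), 32 pages over it. */
	struct memcg m = { .usage = 262176UL, .high = 262144UL };

	if (!try_charge(&m, HPAGE_PMD_NR))  /* THP charge fails... */
		try_charge(&m, 1);          /* ...retry with 4K page */
	printf("usage now %lu pages\n", m.usage);
	return 0;
}

Run as written, the 512-page THP charge fails, the single-page
fallback is charged, and that pass reclaims at most SWAP_CLUSTER_MAX
pages - the worst case per charger stays bounded regardless of how far
over the high limit the group already is.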