On Fri, 2023-01-27 at 10:29 +0100, Michal Hocko wrote:
> On Fri 27-01-23 04:35:22, Leonardo Brás wrote:
> > On Fri, 2023-01-27 at 08:20 +0100, Michal Hocko wrote:
> > > On Fri 27-01-23 04:14:19, Leonardo Brás wrote:
> > > > On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> > > [...]
> > > > > I'd rather opt out of stock draining for isolated cpus: it might slightly
> > > > > reduce the accuracy of memory limits and slightly increase the memory
> > > > > footprint (all those dying memcgs...), but the impact will be limited.
> > > > > Actually it is limited by the number of cpus.
> > > > 
> > > > I was discussing this same idea with Marcelo yesterday morning.
> > > > 
> > > > The questions we had on the topic were:
> > > > a - About how many pages will the pcp cache hold before draining them itself?
> > > 
> > > MEMCG_CHARGE_BATCH (64 currently). And one more clarification: the cache
> > > doesn't really hold any pages. It is a mere counter of how many charges
> > > have been accounted for in the memcg page counter, so it is not really
> > > consuming a proportional amount of resources. It just pins the
> > > corresponding memcg. Have a look at consume_stock and refill_stock.
> > 
> > I see. Thanks for pointing that out!
> > 
> > So in the worst case scenario the memcg would have reserved
> > 64 pages * (numcpus - 1)
> 
> s@numcpus@num_isolated_cpus@

I was thinking of the worst case scenario being (ncpus - 1) cpus isolated.

> > that are not getting used, and may cause an 'earlier' OOM if this amount is
> > needed but can't be freed.
> 
> s@OOM@memcg OOM@

> > Continuing the worst case: suppose a big powerpc machine, 256 CPUs, each
> > holding 64k * 64 pages => 1GB of memory - 4MB (one cpu using its resources).
> > It's starting to get too big, but still ok for a machine this size.
> 
> It is more about the memcg limit rather than the size of the machine.
> Again, let's focus on the actual use case.
> What is the usual memcg setup with those isolcpus?

I understand it's about the limit, not the actually allocated memory.
When I mention the machine size, I mean what a user of a machine that size
could find acceptable.

> > The thing is that it can present an odd behavior:
> > you have a cgroup created before, now empty, try to run a given application,
> > and it hits OOM.
> 
> The application would either consume those cached charges or flush them
> if it is running in a different memcg. Or what do you have in mind?

1 - Create a memcg with a VM inside, multiple vcpus pinned to isolated cpus.
2 - Run a multi-cpu task inside the VM; it allocates memory on every CPU and
    keeps the pcp caches filled.
3 - Try to run a single-cpu task (pinned?) inside the VM, which uses almost all
    of the available memory.
4 - memcg OOM.

Does it make sense?

> > You then restart the cgroup and run the same application without an issue.
> > 
> > Even though it looks like a good possibility, this can be perceived by the
> > user as instability.
> > 
> > > > b - Would it cache any kind of bigger page, or huge pages, in this same way?
> > > 
> > > The above should answer this, as well as those following up, I hope. If
> > > not, let me know.
> > 
> > IIUC we are talking about normal pages, is that it?
> 
> We are talking about memcg charges, and those have page granularity.

Thanks for the info!
Also, thanks for the feedback!

Leo