On Sun, Dec 18, 2022 at 06:01:09PM +0800, Yafang Shao wrote: > On Sun, Dec 18, 2022 at 4:08 PM Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> wrote: > > > > On Sat, Dec 17, 2022 at 10:58:31AM +0000, Yafang Shao wrote: > > > On 64bit system, page extension is for debugging purpose only currently, > > > because of its overhead, in particular the memory overhead. > > > > > > Once a page_ext is enabled, at least it will take 0.2% of the total > > > memory because of the page_ext flags, no matter this page_ext uses it or > > > not. Currently this page_ext flags is only used for page_owner on 64bit > > > system. So it doesn't make sense to allocate this flags for all page_ext > > > by default. We'd better move it into page_owner's structure, then when > > > someone wants to introduce a new page_ext which may be memory-overhead > > > sensitive, it will save this unneeded overhead. > > > > Hi Hyeonggon, Hi Yafang. > > I'm still worried about the approach of using page_ext for BPF memory > > accounting. > > > > This patchset is independent of BPF memory accounting. It is just an > improvement for the page_ext itself. this improvement doesn't make sense until we have non-debugging/memory overhead sensitive use case of page_ext. > > Let's say we finally accepted the approach of using page_ext for BPF memory > > accounting, and if it's not enabled by default, you need to rebuild the kernel > > when you want to check BPF memory usage. (Or make everyone bear the overhead > > by enabling page_ext default for BPF memory accounting that one may not be interested) > > > > The compile config can be enabled by default, but the static key is > disabled by default. That said, the user doesn't need to rebuild the > kernel. The user can pass a kernel parameter to enable or disable it. > But that doesn't mean I have made a final decision to use page_ext to > account for it. Okay. > I will investigate if the dynamic page_ext can be > introduced or not first. If we can allocate page_ext dynamically > rather than allocate them at boot, the page_ext will be used in the > production environment rather than for debugging purposes only, that > will make it more useful. IMHO that's gonna be quite challenging.... To get page_ext using pfn, we need array of page_ext. And in FLATMEM size of the array can be bigger than maximum allocation size supported by buddy allocator. (that's why page_ext uses early memory allocator, memblock in FLATMEM and memblock is unavailable after boot process) Why challenge page_ext if call_rcu() can solve the problem? > > How the problem of accounting kfree_rcu() is going? > > Doesn't call_rcu() work for the problem? > > I still think it's much more feasible. > > > > I'm also investigating the call_rcu() solution. The disadvantage of > call_rcu(), as I have explained in earlier replyment, is that it adds > restrictions for this solution What kind of restrictions did you mean? > and it can be broken easily if MM guys > introduce other deferred freeing functions inside mm/. That will be a > burden for further development. No, if call_rcu() is called in BPF subsystem and if the callback just accounts memory usage & calls memory free API (like kfree()/vfree()/etc), this is not going to be burden for MM developers at all. Also, even if it is considered as burden, it can be justified because it avoids increasing memory usage. -- Thanks, Hyeonggon