On 10/28/24 18:54, Yu Zhao wrote:
> On Mon, Oct 28, 2024 at 5:01 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>
>> Yes, you're right. But since we don't plan to backport it beyond 6.12,
>> sorry for sidetracking the discussion unnecessarily. More importantly,
>> is it possible to change the implementation as I suggested?
>
> The only reason I didn't fold account_highatomic_freepages() into
> account_freepages() is that the former must be called under the
> zone lock, which is also how the latter happens to be called, but
> not as a requirement.

Ah, I guess we can document the requirement and add a lockdep assert.
Using __mod_zone_page_state() already implies some context restrictions,
although not the zone lock specifically.

> I understand where you're coming from when suggesting a new per-cpu
> counter for free highatomic. I have to disagree with that because
> 1) free highatomic is relatively small and drifting might defeat its
> purpose; 2) per-cpu memory is among the top kernel memory overheads
> in our fleet -- it really adds up. So I prefer not to use per-cpu
> counters unless necessary.

OK, I hadn't thought of those drawbacks.

> So if it's ok with you, I'll just fold account_highatomic_freepages()
> into account_freepages(), but keep the counter per zone, not per cpu.

OK, thanks!

>> [1] Hooking to __del_page_from_free_list() and __add_to_free_list()
>> means extra work in every loop iteration in expand() and
>> __free_one_page(). The migratetype hygiene should ensure it's not
>> necessary to intercept every freelist add/move, and hooking into
>> account_freepages() should be sufficient and in line with the
>> intended design.
>
> Agreed.