Hi Thomas,

On Tue, Nov 29, 2022 at 3:56 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Song,
>
> On Tue, Nov 29 2022 at 09:26, Song Liu wrote:
> > On Tue, Nov 29, 2022 at 2:23 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> Modules are the obvious starting point. Once that is solved pretty much
> >> everything else falls into place including BPF.
> >>
> >> Without modules support this whole exercise is pointless and not going
> >> anywhere near x86.
> >
> > I am not sure I fully understand your point here. Do you mean
> >
> > 1) There is something wrong with this solution, that makes it not suitable
> > for modules;
> > or
> > 2) The solution is in the right direction and it will very likely work
> > for modules.
> > But we haven't finished module support. ?
>
> As I'm obviously unable to express myself coherently, let me try again:
>
> A solution which solves the BPF problem, but does not solve the
> underlying problem of module_alloc() is not acceptable.
>
> Is that clear enough?

While I sincerely want to provide a solution not just for BPF but also
for modules and others, I don't think I fully understand the underlying
problem of module_alloc(). I sincerely would like to learn more about it.

>
> > If it is 1), I would like to understand what are the issues that make it not
> > suitable for modules. If it is 2), I think a solid, mostly like working small
> > step toward the right direction is the better way as it makes code reviews
> > a lot easier and has much lower risks. Does this make sense?
>
> No. Because all you are interested in is to get your BPF itch scratched
> instead of actually sitting down and solving the underlying problem and
> thereby creating a benefit for everyone.

TBH, until your reply, I thought I was working on something that would
benefit everyone. It is indeed not just for BPF itch, as bpf_prog_pack
already scratched it for BPF.

>
> You are not making anything easier.
> You are violating the basic
> engineering principle of "Fix the root cause, not the symptom".
>

I am not sure what the root cause and the symptom are here.

I understand the ideas referred to in this LWN article:

https://lwn.net/Articles/894557/

But I don't know which one of them (if any) would fix the root cause.

> By doing that you are actually creating more problems than you
> solve. Why?
>
> Clearly your "solution" does not cover the full requirements of the
> module space because you solely focus on executable memory allocations
> which somehow magically go into the module address space.
>
> Can you coherently explain how this results in a consistent solution
> for the rest of the module requirements?
>
> Can you coherently explain why this wont create problems down the road
> for anyone who actually would be willing to solve the root cause?
>
> No, you can't answer any of these questions simply because you never
> explored the problem space sufficiently.

I was thinking, for modules, we only need something new for module text,
and module data will just use vmalloc(). I guess this is probably not
the right solution?

>
> I'm not the first one to point this out. Quite some people in the
> various threads regarding this issue have been pointing that out to you
> before. They even provided you hints on how this can be solved properly
> once and forever and for everyones benefits.

I tried to review various threads. Unfortunately, I am not able to
identify the proper hints and construct a solution.

>
> > I would also highlight that part of the benefit of this work comes from
> > reducing direct map fragmentations. While BPF programs consume less
> > memory, they are more dynamic and can cause more direct map
> > fragmentations. bpf_prog_pack in upstream kernel already covers this
> > part, but this set is a better solution than bpf_prog_pack.
> >
> > Finally, I would like to point out that 5/6 and 6/6 of (v5) the set let BPF
> > programs share a 2MB page with static kernel text. Therefore, even
> > for systems without many BPF programs, we should already see some
> > reduction in iTLB misses.
>
> Can you please stop this marketing nonsense? As I pointed out to you in
> the very mail which your are replying to, the influence of BPF on the
> system I picked randomly out of the pool is pretty close to ZERO.
>
> Ergo, the reduction of iTLB misses is going to be equally close to
> ZERO. What is the benefit you are trying to sell me?
>
> I'm happy to run perf on this machine and provide the numbers which put
> your 'we should already see some reduction' handwaving into perspective.
>
> But the above is just a distraction. The real point is:
>
> You can highlight and point out the benefits of your BPF specific
> solution as much as you want, it does not make the fact that you are
> "fixing" the symptom instead of the root cause magically go away.
>
> Again for the record:
>
> The iTLB pressure problem, which affects modules, kprobes, tracing and
> BPF, is caused by the way how module_alloc() is implemented.

TBH, I don't think I understand this... Do you mean the problem with
module_alloc() is that it is not aware of desired permissions (W or X
or neither)? If so, is permission vmalloc [1] the right direction for
this?

[1] https://lwn.net/ml/linux-mm/20201120202426.18009-1-rick.p.edgecombe@xxxxxxxxx/

>
> That's the root cause and this needs to be solved for _ALL_ of the users
> of this infrastructure and not worked around by adding something which
> makes BPF shiny and handwaves about that it solves the underlying
> problem.

While I did plan to enable 2MB pages for module text, I didn't plan to
solve it in the first set. However, since you think it is possible and
would like to provide directions, I am up for the challenge and will
give it a try. Please share more details about the right direction.
Otherwise, I am still lost...

Thanks,
Song