> On Jul 8, 2022, at 3:24 PM, Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: > > On Fri, Jul 08, 2022 at 07:58:44PM +0000, Song Liu wrote: >> >> >>> On Jul 8, 2022, at 8:58 AM, Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: >>> >>> On Fri, Jul 08, 2022 at 01:36:25AM +0000, Song Liu wrote: >>>> >>>> >>>>> On Jul 7, 2022, at 5:53 PM, Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: >>>>> >>>>> On Thu, Jul 07, 2022 at 11:52:58PM +0000, Song Liu wrote: >>>>>>> On Jul 7, 2022, at 3:59 PM, Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: >>>>>>> >>>>>>> On Thu, Jul 07, 2022 at 03:35:41PM -0700, Song Liu wrote: >>>>>>>> This set is the second half of v4 [1]. >>>>>>>> >>>>>>>> Changes v5 => v6: >>>>>>>> 1. Rebase and extend CC list. >>>>>>> >>>>>>> Why post a new iteration so soon without completing the discussion we >>>>>>> had? It seems like we were at least going somewhere. If it's just >>>>>>> to include mm as I requested, sure, that's fine, but this does not >>>>>>> provide context as to what we last were talking about. >>>>>> >>>>>> Sorry for sending v6 too soon. The primary reason was to extend the CC >>>>>> list and add it back to patchwork (v5 somehow got archived). >>>>>> >>>>>> Also, I think vmalloc_exec_ work would be a separate project, while this >>>>>> set is the followup work of bpf_prog_pack. Does this make sense? >>>>>> >>>>>> Btw, vmalloc_exec_ work could be a good topic for LPC. It will be much >>>>>> more efficient to discuss this in person. >>>>> >>>>> What we need is input from mm / arch folks. What is not done here is >>>>> what that stuff we're talking about is and so mm folks can't guess. My >>>>> preference is to address that. >>>>> >>>>> I don't think in person discussion is needed if the only folks >>>>> discussing this topic so far is just you and me. >>>> >>>> How about we start a thread with mm / arch folks for the vmalloc_exec_* >>>> topic? I will summarize previous discussions and include pointers to >>>> these discussions. If necessary, we can continue the discussion at LPC. >>> >>> This sounds like a nice thread to use as this is why we are talking >>> about that topic. >>> >>>> OTOH, I guess the outcome of that discussion should not change this set? >>> >>> If the above is done right then actually I think it would show similar >>> considerations for a respective free for module_alloc_huge(). >>> >>>> If we have concern about module_alloc_huge(), maybe we can have bpf code >>>> call vmalloc directly (until we have vmalloc_exec_)? >>> >>> You'd need to then still open code in a similar way the same things >>> which we are trying to reach consensus on. >> >>>> What do you think about this plan? >>> >>> I think we should strive to not be lazy and sloppy, and prevent growth >>> of sloppy code. So long as we do that I think this is all reasoanble. >> >> Let me try to understand your concerns here. Say if we want module code >> to be a temporary home for module_alloc_huge before we move it to mm >> code. Would you think it is ready to ship if we: > > Please CC Christoph and linux-modules@xxxxxxxxxxxxxxx on future patches > and dicussions aroudn this, and all others now CC'd. Sometimes, vger drops my patch because the CC list is too long. That's the reason I often trim the CC list. I will try to keep folks in this thread CC'ed. > >> 1) Rename module_alloc_huge as module_alloc_text_huge(); > > module_alloc_text_huge() is too long, but I've suggested names before > which are short and generic, and also suggested that if modules are > not the only users this needs to go outside of modules and so > vmalloc_text_huge() or whatever. > > To do this right it begs the question why we don't do that for the > existing module_alloc(), as the users of this code is well outside of > modules now. Last time a similar generic name was used all the special > arch stuff was left to be done by the module code still, but still > non-modules were still using that allocator. From my perspective the > right thing to do is to deal with all the arch stuff as well in the > generic handler, and have the module code *and* the other users which > use module_alloc() to use that new caller as well. The key difference between module_alloc() and the new API is that the API will return RO+X memory, and the user need text-poke like API to modify this buffer. Archs that do not support text-poke will not be able to use the new API. Does this sound like a reasonable design? > >> 2) Add module_free_text_huge(); > > Right, we have special handling for how we free this special code for regular > module_alloc() and so similar considerations would be needed here for > the huge stuff. > >> 3) Move set_memory_* and fill_ill_insn logic into module_alloc_text_huge() >> and module_free_text_huge(). > > Yes, that's a bit hairy now, and so a saner and consistent way to do > this would be best. Thanks for these information. I will try to go this direction. > >> Are these on the right direction? Did I miss anything important? > > I've also hinted before that another way to help here is to have draw > up a simple lib/test_vmalloc_text.c or something like that which would > enable a selftest to ensure correctness of this code on different archs > and maybe even let you do performance analysis using perf [0]. You have > good reasons to move to the huge allocator and the performance metrics > are an abstract load, however perf measurements can also give you real > raw data which you can reproduce and enable others to do similar > comparisons later. > > The last thing I'd ask is just ensure you Cc folks who have already been in > these discussions. > > [0] https://lkml.kernel.org/r/Yog+d+oR5TtPp2cs@xxxxxxxxxxxxxxxxxxxxxx Let me see how we can test it. Thanks, Song