On Mon, Jun 05, 2023 at 12:20:40PM +0300, Mike Rapoport wrote:
> On Fri, Jun 02, 2023 at 10:35:09AM +0100, Mark Rutland wrote:
> > On Thu, Jun 01, 2023 at 02:14:56PM -0400, Kent Overstreet wrote:
> > > On Thu, Jun 01, 2023 at 05:12:03PM +0100, Mark Rutland wrote:
> > > > For a while I have wanted to give kprobes its own allocator so that it can work
> > > > even with CONFIG_MODULES=n, and so that it doesn't have to waste VA space in
> > > > the modules area.
> > > >
> > > > Given that, I think these should have their own allocator functions that can be
> > > > provided independently, even if those happen to use common infrastructure.
> > >
> > > How much memory can kprobes conceivably use? I think we also want to try
> > > to push back on combinatorial new allocators, if we can.
> >
> > That depends on who's using it, and how (e.g. via BPF).
> >
> > To be clear, I'm not necessarily asking for entirely different allocators, but
> > I do think that we want wrappers that can at least pass distinct start+end
> > parameters to a common allocator, and for arm64's modules code I'd expect that
> > we'd keep the range fallback logic out of the common allocator, and just call
> > it twice.
> >
> > > > > Several architectures override module_alloc() because of various
> > > > > constraints where the executable memory can be located and this causes
> > > > > additional obstacles for improvements of code allocation.
> > > > >
> > > > > This set splits code allocation from modules by introducing
> > > > > jit_text_alloc(), jit_data_alloc() and jit_free() APIs, replaces call
> > > > > sites of module_alloc() and module_memfree() with the new APIs and
> > > > > implements core text and related allocation in a central place.
> > > > >
> > > > > Instead of architecture specific overrides for module_alloc(), the
> > > > > architectures that require non-default behaviour for text allocation must
> > > > > fill jit_alloc_params structure and implement jit_alloc_arch_params() that
> > > > > returns a pointer to that structure. If an architecture does not implement
> > > > > jit_alloc_arch_params(), the defaults compatible with the current
> > > > > modules::module_alloc() are used.
> > > >
> > > > As above, I suspect that each of the callsites should probably be using common
> > > > infrastructure, but I don't think that a single jit_alloc_arch_params() makes
> > > > sense, since the parameters for each case may need to be distinct.
> > >
> > > I don't see how that follows. The whole point of function parameters is
> > > that they may be different :)
> >
> > What I mean is that jit_alloc_arch_params() tries to aggregate common
> > parameters, but they aren't actually common (e.g. the actual start+end range
> > for allocation).
>
> jit_alloc_arch_params() tries to aggregate architecture constraints and
> requirements for allocations of executable memory and this is exactly what
> the first 6 patches of this set do.
>
> A while ago Thomas suggested to use a structure that parametrizes
> architecture constraints by the memory type used in modules [1] and Song
> implemented the infrastructure for it and x86 part [2].
>
> I liked the idea of defining parameters in a single structure, but I
> thought that approaching the problem from the arch side rather than from
> modules perspective will be a better starting point, hence these patches.
>
> I don't see a fundamental reason why a single structure cannot describe
> what is needed for different code allocation cases, be it modules, kprobes
> or bpf.
> There is of course an assumption that the core allocations will be
> the same for all the users, and it seems to me that something like
>
> * allocate physical memory if allocator caches are empty
> * map it in vmalloc or modules address space
> * return memory from the allocator cache to the caller
>
> will work for all usecases.
>
> We might need separate caches for different cases on different
> architectures, and a way to specify what cache should be used in the
> allocator API, but that does not contradict a single structure for arch
> specific parameters, but only makes it more elaborate, e.g. something like
>
> enum jit_type {
> 	JIT_MODULES_TEXT,
> 	JIT_MODULES_DATA,
> 	JIT_KPROBES,
> 	JIT_FTRACE,
> 	JIT_BPF,
> 	JIT_TYPE_MAX,
> };

Why would we actually need different enum values for modules_text,
kprobes, ftrace and bpf? Why can't we treat all text allocations the same?

The reason we can't do that currently is that modules need to go in a
128MB region on some archs, and without sub-page allocation
bpf/kprobes/etc. burn a full page for each allocation. But we're doing
sub-page allocation - right?

That leaves module data - which really needs to be split out into rw,
ro, ro_after_init - but I'm not sure we'd even want the same API for
those, since they need fairly different page permissions handling.
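
To illustrate the sub-page point, here is a rough userspace sketch (not
kernel code). It assumes a fixed pool of pages standing in for an
executable range, and every name in it (text_pool, text_alloc(),
text_page_of(), the SKETCH_* constants) is made up for illustration; a
real allocator would look different. The only thing it tries to show is
that small text allocations from different users can share a page
instead of each taking a whole one:

```c
/* Userspace illustration only; none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_NR_PAGES  4u
#define SKETCH_ALIGN     64u	/* assumed minimum alignment for text chunks */

/* Stand-in for a range of executable pages; a zeroed pool is empty. */
struct text_pool {
	unsigned char pages[SKETCH_NR_PAGES][SKETCH_PAGE_SIZE];
	size_t used[SKETCH_NR_PAGES];	/* bytes already handed out per page */
};

/* Round len up to a multiple of SKETCH_ALIGN. */
static size_t sketch_round_up(size_t len)
{
	return (len + SKETCH_ALIGN - 1) & ~((size_t)SKETCH_ALIGN - 1);
}

/*
 * Hand out len bytes from the first page that still has enough room.
 * Returns NULL if len is zero, larger than a page, or no page fits it.
 */
static void *text_alloc(struct text_pool *pool, size_t len)
{
	size_t need, i;

	if (len == 0 || len > SKETCH_PAGE_SIZE)
		return NULL;

	need = sketch_round_up(len);
	for (i = 0; i < SKETCH_NR_PAGES; i++) {
		if (SKETCH_PAGE_SIZE - pool->used[i] >= need) {
			void *p = &pool->pages[i][pool->used[i]];

			pool->used[i] += need;
			return p;
		}
	}
	return NULL;
}

/* Index of the page that p was carved out of. */
static size_t text_page_of(const struct text_pool *pool, const void *p)
{
	return (size_t)((const unsigned char *)p - &pool->pages[0][0]) /
	       SKETCH_PAGE_SIZE;
}
```

With a zeroed pool, a 40 byte request followed by a 100 byte request
both come out of page 0, 64 bytes apart, where page-granular allocation
would have used two pages. The sketch has no free path and no
permission handling, both of which the real thing obviously needs.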