> On Apr 8, 2022, at 3:08 AM, Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx> wrote:
>
> On Thu, 7 Apr 2022 19:57:25 +0000
> Song Liu <songliubraving@xxxxxx> wrote:
>
>> Hi Nicholas and Claudio,
>>
>>> On Apr 5, 2022, at 4:54 PM, Song Liu <songliubraving@xxxxxx> wrote:
>>>
>>>> On Apr 5, 2022, at 12:07 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>>>>
>>>> On Fri, Apr 01, 2022 at 10:22:00PM +0000, Song Liu wrote:
>>>>>>> Please fix the underlying issues instead of papering over them and
>>>>>>> creating a huge maintenance burden for others.
>>>>>
>>>>> After reading the code a little more, I wonder what would be the best strategy.
>>>>> IIUC, most of the kernel is not ready for huge page backed vmalloc memory.
>>>>> For example, all the module_alloc cannot work with huge pages at the moment.
>>>>> And the error Paul Menzel reported in drm_fb_helper.c will probably hit
>>>>> powerpc with 5.17 kernel as-is? (trace attached below)
>>>>>
>>>>> Right now, we have VM_NO_HUGE_VMAP to let a user opt out of huge pages.
>>>>> However, given there are so many users of vmalloc, vzalloc, etc., we
>>>>> probably do need a flag for the user to opt-in?
>>>>>
>>>>> Does this make sense? Any recommendations are really appreciated.
>>>>
>>>> I think there are multiple aspects here:
>>>>
>>>> - if we think that the kernel is not ready for hugepage backed vmalloc
>>>> in general we need to disable it in powerpc for now.
>>>
>>> Nicholas and Claudio,
>>>
>>> What do you think about the status of hugepage backed vmalloc on powerpc?
>>> I found module_alloc and kvm_s390_pv_alloc_vm() opt-out of huge pages.
>>> But I am not aware of users that benefit from huge pages (except vfs hash,
>>> which was mentioned in 8abddd968a30). Does an opt-in flag (instead of
>>> current opt-out flag, VM_NO_HUGE_VMAP) make sense to you?
>>
>> Could you please share your comments on this? Specifically, does it make
>> sense to replace VM_NO_HUGE_VMAP with an opt-in flag? If we think the current
>> opt-out flag is the better approach, what would be the best practice to find
>> all the cases to opt-out?
>
> An opt-in flag would surely make sense, and it would be more backwards
> compatible with existing code. That way each user can decide whether to
> fix the code to allow for hugepages, if possible at all. For example,
> the case you mentioned for s390 (kvm_s390_pv_alloc_vm) would not be
> fixable, because of a hardware limitation (the whole area _must_ be
> mapped with 4k pages)
>
> If the consensus were to keep the current opt-out, then I guess each
> user would have to check each usage of vmalloc and similar, and see if
> anything breaks. To be honest, I think an opt-out would only be
> possible after having the opt-in for a (long) while, when most users
> would have fixed their code.
>
> In short, I fully support opt-in.

Thanks Claudio!

I will prepare patches to replace VM_NO_HUGE_VMAP with an opt-in flag, and
use the new flag in BPF. Please let me know if you have any comments or
suggestions on this direction.

Song
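
As a rough illustration of the two policies discussed in this thread, the
userspace C sketch below contrasts opt-out with opt-in. It is not kernel
code: the flag values, the opt-in flag name SKETCH_VM_OPT_IN_HUGE, the 2 MiB
threshold, and both helper functions are assumptions made up for the example.
Only the name VM_NO_HUGE_VMAP comes from the discussion above.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; the values are invented. */
#define SKETCH_VM_NO_HUGE_VMAP  0x1u  /* opt-out: caller forbids huge mappings */
#define SKETCH_VM_OPT_IN_HUGE   0x2u  /* hypothetical opt-in: caller permits them */

/* Assumed huge page size (2 MiB); real sizes depend on the architecture. */
#define SKETCH_HUGE_SIZE (2ul * 1024 * 1024)

/* Opt-out policy: any large enough allocation may be huge-page backed
 * unless the caller explicitly sets the "no huge" flag. */
static bool use_huge_opt_out(unsigned long size, unsigned int flags)
{
    return size >= SKETCH_HUGE_SIZE && !(flags & SKETCH_VM_NO_HUGE_VMAP);
}

/* Opt-in policy: an allocation is huge-page backed only if the caller
 * explicitly asks for it, so callers that pass no flag are unaffected. */
static bool use_huge_opt_in(unsigned long size, unsigned int flags)
{
    return size >= SKETCH_HUGE_SIZE && (flags & SKETCH_VM_OPT_IN_HUGE) != 0;
}
```

With flags == 0, an existing caller gets huge mappings under the opt-out
policy but keeps small pages under the opt-in policy, which is the
backwards-compatibility argument made above.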