On Thu, 7 Apr 2022 19:57:25 +0000
Song Liu <songliubraving@xxxxxx> wrote:

> Hi Nicholas and Claudio,
>
> > On Apr 5, 2022, at 4:54 PM, Song Liu <songliubraving@xxxxxx> wrote:
> >
> >> On Apr 5, 2022, at 12:07 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> >>
> >> On Fri, Apr 01, 2022 at 10:22:00PM +0000, Song Liu wrote:
> >>>>> Please fix the underlying issues instead of papering over them and
> >>>>> creating a huge maintainance burden for others.
> >>>
> >>> After reading the code a little more, I wonder what would be best strategy.
> >>> IIUC, most of the kernel is not ready for huge page backed vmalloc memory.
> >>> For example, all the module_alloc cannot work with huge pages at the moment.
> >>> And the error Paul Menzel reported in drm_fb_helper.c will probably hit
> >>> powerpc with 5.17 kernel as-is? (trace attached below)
> >>>
> >>> Right now, we have VM_NO_HUGE_VMAP to let a user to opt out of huge pages.
> >>> However, given there are so many users of vmalloc, vzalloc, etc., we
> >>> probably do need a flag for the user to opt-in?
> >>>
> >>> Does this make sense? Any recommendations are really appreciated.
> >>
> >> I think there is multiple aspects here:
> >>
> >> - if we think that the kernel is not ready for hugepage backed vmalloc
> >>   in general we need to disable it in powerpc for now.
> >
> > Nicholas and Claudio,
> >
> > What do you think about the status of hugepage backed vmalloc on powerpc?
> > I found module_alloc and kvm_s390_pv_alloc_vm() opt-out of huge pages.
> > But I am not aware of users that benefit from huge pages (except vfs hash,
> > which was mentioned in 8abddd968a30). Does an opt-in flag (instead of
> > current opt-out flag, VM_NO_HUGE_VMAP) make sense to you?
>
> Could you please share your comments on this? Specifically, does it make
> sense to replace VM_NO_HUGE_VMAP with an opt-in flag? If we think current
> opt-out flag is better approach, what would be the best practice to find
> all the cases to opt-out?

An opt-in flag would surely make sense, and it would be more backwards
compatible with existing code. That way each user can decide whether to
fix the code to allow for hugepages, if possible at all. For example,
the case you mentioned for s390 (kvm_s390_pv_alloc_vm) would not be
fixable, because of a hardware limitation (the whole area _must_ be
mapped with 4k pages).

If the consensus were to keep the current opt-out, then I guess each
user would have to check each usage of vmalloc and similar, and see if
anything breaks. To be honest, I think an opt-out would only be
possible after having the opt-in for a (long) while, when most users
would have fixed their code.

In short, I fully support opt-in.

>
> Thanks,
> Song
>
>
> > Thanks,
> > Song
> >
> >> - if we think even in the longer run only some users can cope with
> >>   hugepage backed vmalloc we need to turn it into an opt-in in
> >>   general and not just for x86
> >> - there still to appear various unresolved underlying x86 specific
> >>   issues that need to be fixed either way
> >
>