On Fri, 27 Apr 2018, Michael S. Tsirkin wrote: > On Thu, Apr 26, 2018 at 05:50:20PM -0400, Mikulas Patocka wrote: > > How is the user or developer supposed to learn about this option, if > > he gets no crash at all? > > Look in /sys/kernel/debug/fail* ? That actually lets you > filter by module, process etc. > > I think this patch conflates two things: > > 1. Make kvmalloc use the vmalloc path. > This seems a bit narrow. > What is special about kvmalloc? IMHO nothing - it's yet another user > of __GFP_NORETRY or __GFP_RETRY_MAYFAIL. As any such __GFP_RETRY_MAYFAIL makes the allocator retry the costly_order allocations > user, it either recovers correctly or not. > So IMHO it's just a case of > making __GFP_NORETRY, __GFP_RETRY_MAYFAIL, or both > fail once in a while. > Seems like a better extension to me than focusing on vmalloc. > I think you will find more bugs this way. If the array is <= PAGE_SIZE, vmalloc will not use __GFP_NORETRY. So it still hides some bugs - such as, if a structure grows above 4k, it would start randomly crashing due to memory fragmentation. > 2. Ability to control this from a separate config > option. > > It's still not that clear to me why is this such a > hard requirement. If a distro wants to force specific > boot time options, why isn't CONFIG_CMDLINE sufficient? There are 489 kernel options declared with the __setup keyword. Hardly any kernel developer notices that a new one was added and selects it when testing his code. > But assuming it's important to control this kind of > fault injection to be controlled from > a dedicated menuconfig option, why not the rest of > faults? The injected faults cause damage to the user, so there's no point to enable them by default. vmalloc fallback should not cause any damage (assuming that the code is correctly written). > IMHO if you split 1/2 up, and generalize, the path upstream > will be much smoother. This seems like a lost case. So, let's not care about code correctness and let's solve crashes only after they are reported. If the upstream wants to work this way, there's nothing that can be done about it. I'm wondering if I can still push it to RHEL or not. > Hope this helps. > > -- > MST Mikulas