> On Mar 24, 2023, at 6:31 AM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > On Fri, Mar 24, 2023 at 06:03:37AM +0000, Song Liu wrote: >> >> >>> On Mar 23, 2023, at 8:30 PM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: >> >> [...] >> >>> >>> The Uptodate flag check needs to be done by the caller; the >>> find_get_page() family return !uptodate pages. >>> >>> But find_get_page() does not advertise itself as NMI-safe. And I >>> think it's wrong to try to make it NMI-safe. Most of the kernel is >>> not NMI-safe. I think it's incumbent on the BPF people to get the >>> information they need ahead of taking the NMI. NMI handlers are not >>> supposed to be doing a huge amount of work! I don't really understand >>> why it needs to do work in NMI context; surely it can note the location of >>> the fault and queue work to be done later (eg on irq-enable, task-switch >>> or return-to-user) >> >> The use case here is a profiler (similar to perf-record). Parsing the >> build id in side the NMI makes the profiler a lot simpler. Otherwise, >> we will need some post processing for each sample. > > Simpler for you, maybe. But this is an NMI! It's not supposed to > be doing printf-formatting or whatever, much less poking around > in the file cache. Like perf, it should record a sample and then > convert that later. Maybe it can defer to a tasklet, but i think > scheduling work is a better option. > >> OTOH, it is totally fine if build_id_parse() fails some time, say < 5%. >> The profiler output is still useful in such cases. >> >> I guess the next step is to replace find_get_page() with a NMI-safe >> version? > > No, absolutely not. Stop doing so much work in an NMI. While I understand the concern, it is not something we can easily remove, as there are users rely on this feature. How about we discuss this at upcoming LSFMMBPF? Thanks, Song