On Tue, Mar 23, 2021 at 08:50:53AM +0100, Michal Hocko wrote: > > >> +static inline unsigned long min_hp_count(struct hstate *h, unsigned long count) > > >> +{ > > >> + unsigned long min_count; > > >> + > > >> + min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages; > > >> + return max(count, min_count); > > > > > > Just out of curiousity, is compiler allowed to inline this piece of code > > > and then cache the value? In other words do we need to make these > > > READ_ONCE or otherwise enforce the no-caching behavior? > > > > I honestly do not know if the compiler is allowed to do that. The > > assembly code generated by my compiler does not cache the value, but > > that does not guarantee anything. I can add READ_ONCE to make the > > function look something like: > > > > static inline unsigned long min_hp_count(struct hstate *h, unsigned long count) > > { > > unsigned long min_count; > > > > min_count = READ_ONCE(h->resv_huge_pages) + READ_ONCE(h->nr_huge_pages) > > - READ_ONCE(h->free_huge_pages); > > return max(count, min_count); > > } > > Maybe just forcing to never inline the function should be sufficient. > This is not a hot path to micro optimize for no function call. But there > are much more qualified people on the CC list on this matter who could > clarify. Peter? I'm not sure I understand the code right. But inline or not doesn't matter, LTO completely ruins that game. Just like if it was a static function, then the compiler is free to inline it, even if the function lacks an inline attribute. Basically, without READ_ONCE() the compiler is allowed to entirely elide the load (and use a previous load), or to duplicate the load and do it again later (reaching a different result). Similarly, the compiler is allowed to byte-wise load the variable in any random order and re-assemble. If any of that is a problem, you have to use READ_ONCE().