On Wed, May 11, 2022 at 08:44:43PM -0700, Minchan Kim wrote:
> On Wed, May 11, 2022 at 07:18:56PM -0700, John Hubbard wrote:
> > On 5/11/22 18:08, John Hubbard wrote:
> > > On 5/11/22 18:03, Minchan Kim wrote:
> > > > >
> > > > > Or there might be some code path that really hates a READ_ONCE() in
> > > > > that place.
> > > >
> > > > My worry about changing __get_pfnblock_flags_mask is it's called
> > > > multiple hot places in mm codes so I didn't want to add overhead
> > > > to them.
> > >
> > > ...unless it really does generate the same code as is already there,
> > > right? Let me check that real quick.
> > >
> >
> > It does change the generated code slightly. I don't know if this will
> > affect performance here or not. But just for completeness, here you go:
> >
> > free_one_page() originally has this (just showing the changed parts):
> >
> >     mov    0x8(%rdx,%rax,8),%rbx
> >     and    $0x3f,%ecx
> >     shr    %cl,%rbx
> >     and    $0x7,
> >
> >
> > And after applying this diff:
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 0e42038382c1..df1f8e9a294f 100644
> > +++ b/mm/page_alloc.c
> > @@ -482,7 +482,7 @@ unsigned long __get_pfnblock_flags_mask(const struct
> > page *page,
> >  	word_bitidx = bitidx / BITS_PER_LONG;
> >  	bitidx &= (BITS_PER_LONG-1);
> >
> > -	word = bitmap[word_bitidx];
> > +	word = READ_ONCE(bitmap[word_bitidx]);
> >  	return (word >> bitidx) & mask;
> >  }
> >
> >
> > ...it now does an extra memory dereference:
> >
> >     lea    0x8(%rdx,%rax,8),%rax
> >     and    $0x3f,%ecx
> >     mov    (%rax),%rbx
> >     shr    %cl,%rbx
> >     and    $0x7,%ebx

Where is the extra memory reference? 'lea' is not a memory reference,
it is just some maths?

> Thanks for checking, John.
>
> I don't want to have the READ_ONCE in __get_pfnblock_flags_mask
> atm even though it's an extra memory dereference for specific
> architecture and specific compiler unless other callsites *do*
> need it.

If a call path can be reached both with and without the lock held,
then I would expect two call chains, each clearly marked with its
locking conditions, ie __get_pfn_block_flags_mask_unlocked() - and
obviously the locking requirements of the locked path should be
clearly documented and checked.

IMHO putting a READ_ONCE on something that is not a memory load from
shared data is nonsense - if a simple == has a stability risk then so
does the '(word >> bitidx) & mask'.

Jason
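
For illustration only, the split suggested above might look roughly like
the following user-space sketch. It is not kernel code: READ_ONCE() here
is a simplified volatile-cast stand-in for the kernel macro, and the names
extract_flags(), get_flags_locked() and get_flags_unlocked() are
hypothetical. The point it tries to show is that the only place a
READ_ONCE() can matter is the single load from the shared bitmap; the
shift and mask then operate on a local copy in both variants.

```c
#include <assert.h>
#include <limits.h>

/* Simplified user-space stand-in for the kernel's READ_ONCE() (assumption). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: pure arithmetic on a local copy, no shared memory. */
static unsigned long extract_flags(unsigned long word, unsigned long bitidx,
                                   unsigned long mask)
{
	return (word >> (bitidx & (BITS_PER_LONG - 1))) & mask;
}

/*
 * Hypothetical locked variant: the caller holds the lock that serialises
 * writers, so a plain load is sufficient. In the kernel this is where a
 * lockdep assertion would document and check the requirement.
 */
static unsigned long get_flags_locked(const unsigned long *bitmap,
                                      unsigned long bitidx,
                                      unsigned long mask)
{
	unsigned long word = bitmap[bitidx / BITS_PER_LONG];

	return extract_flags(word, bitidx, mask);
}

/*
 * Hypothetical unlocked variant: the word may change concurrently, so it
 * is loaded exactly once and all later maths uses that snapshot.
 */
static unsigned long get_flags_unlocked(const unsigned long *bitmap,
                                        unsigned long bitidx,
                                        unsigned long mask)
{
	unsigned long word = READ_ONCE(bitmap[bitidx / BITS_PER_LONG]);

	return extract_flags(word, bitidx, mask);
}
```

Whether the unlocked variant costs anything measurable over the locked one
would still have to be checked per architecture and compiler.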