Re: [patch 23/29] mm, hwpoison: do not lock page again when me_huge_page() successfully recovers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Mar 13, 2021 at 11:23:40AM -0800, Linus Torvalds wrote:
> On Fri, Mar 12, 2021 at 9:08 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> > Subject: mm, hwpoison: do not lock page again when me_huge_page() successfully recovers
> 
> This patch can not possibly be correct, and is dangerous and very very wrong.
> 
> >  out:
> > -       unlock_page(head);
> > +       if (PageLocked(head))
> > +               unlock_page(head);
> 
> You absolutely CANNOT do things like this. It is fundamentally insane.
> 
> You have two situations:
> 
>  (a) you know you locked the page.
> 
>      In this case an unconditional "unlock_page()" is the only correct
> thing to do.
> 
>  (b) you don't know whether you maybe unlocked it again.
> 
>      In this case SOMEBODY ELSE might have locked the page, and you
> doing "if it's locked, then unlock it" is pure and utter drivel, and
> fundamentally and seriously wrong. You're unlocking SOMEBODY ELSES
> lock, after having already unlocked your own once.
> 
> Now, it is possible that this kind of pattern could be ok if you
> absolutely 100% know - and it is obvious from the code - that you are
> the only thread that can possibly access the page. But if that is the
> case, then the page should never have been locked in the first place,
> because that would be an insane and pointless operation, since the
> whole - and only - point of locking is to enforce some kind of
> exclusivity - and if exclusivity is explicit in the code-path, then
> locking the page is pointless.
> 
> And yes, in this case I can see a remote theoretical model: "all good
> cases keep the page locked, and the only trhing that unlocks the page
> also guarantees nothing else can possiblly see it".
> 
> But no. I don't actually believe this is fuaranteed to the case here,
> and even if it were, I don't think this is a code sequence we can or
> should accept.
> 
> End result: there is no possible situation that I can think of where the above
> 
>        if (PageLocked(head))
>                unlock_page(head);
> 
> kind of sequence can EVER possibly be correct, and it shows a complete
> disregard of everything that locking is all about.
> 
> Now, the memory error handling may be so special that this effectively
> "works" in practice, but there is no way I can apply such a
> fundamentally broken patch.

OK, I'll update the patch with the second approach you mentioned below.

> 
> I see two options:
> 
>  - make the result of identify_page_state() *tell* you whether the
> page was already unlocked (and then make the unlock conditional on
> *that*)
> 
>    This is valid, because now you removed that whole "somebody else
> might have transferred the lock" issue: you know you already unlocked
> the page based on the return value, so you don't unlock it again.
> That's perfectly valid.
> 
>  - probably better: make identify_page_state() itself always unlock
> the page, and turn the two different code sequences that currently do
> 
>         res = identify_page_state(pfn, p, page_flags);
>   out:
>         unlock_page(head);
>         return res;
> 
> into just doing
> 
>         return identify_page_state(pfn, p, page_flags);
>   out:
>         unlock_page(head);
>         return -EBUSY;
> 
> instead, and move that now pointless "res" variable into the only
> if-statement that actually uses it (for other things entirely! It
> should be a "enum mf_result" there!)
> 
> And yes, that second (and imho better) rule means that now
> {page_action()" itself needs to be the thing that unlocks the page,
> and that in turn might be done a few different ways:
> 
>  (a) either add a separate "MF_UNLOCKED" state bit to the possible
> return codes and make that me_huge_page() that unlocks the page set
> that bit (NOTE! It still needs to also return either MF_FAILED or
> MF_RECOVERED, so this "MF_UNLOCKED" bit really should be a separate
> bitmask entirely from the other MF_xyz bits)
> 
>  (b) make the rule be that *all* the ->action() functions need to just
> unlock the page.
> 
> ( suspect (b) is the better model to aim for, because honestly, static
> code rules for locking are basically almost always superior to dynamic
> "based on this flag" locking rules. You can in theory actually have
> static tooling check that the locking nests correctly with the
> unlocking that way (and it's just conceptually simpler to have a hard
> rule about "this function always unlocks the page").

I'll try (b). Thank you for the detailed advices.

- Naoya Horiguchi

> 
> But that existing patch is *so* fundamentally wrong that I cannot
> possibly apply it, and I feel bad about the fact that it has made it
> all the way to me with sign-offs and reviewed-by's and everything.
> 
>                   Linus
> 




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux