Re: kernel BUG at include/linux/swapops.h:204!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 19 Jul 2021, Peter Xu wrote:
> On Mon, Jul 19, 2021 at 03:42:41PM -0700, Hugh Dickins wrote:
> > On Mon, 19 Jul 2021, Peter Xu wrote:
> > > On Mon, Jul 19, 2021 at 12:11:21PM -0700, Hugh Dickins wrote:
> > > > 
> > > > But I'm confident that 8f34f1eac382 will prove to be the fix, so Peter
> > > > please prepare some backports of that for the various stable/longterm
> > > > kernels that need it - I've not looked into whether it applies cleanly,
> > > > or depends on other commits too.  You fixed several related but different
> > > > things in that commit: but this one is the worst, because it can corrupt
> > > > even those who are not using UFFD_WP at all.
> > > 
> > > Looks right to me, b569a1760782 ("userfaultfd: wp: drop _PAGE_UFFD_WP properly
> > > when fork", 2020-04-07) seems to be the culprit.  I didn't notice the side
> > > effect in the bug or in the fix, or it should have already land stables. I am
> > > very sorry for such a preliminary bug that caused this fallout - I really can't
> > > tell why I completely didn't look at is_swap_pte() that's so obvious indeed.
> > > 
> > > I checked it up, 5.6.y doesn't have the issue commit yet as it's not marked as
> > > "fixes". It started to show up in 5.7.y~5.13.y. 5.14-rc1 has 8f34f1eac382 which
> > > is the fix.  So I think we need the fix or equivalent fix for 5.7.y~5.13.y.
> > > 
> > > 5.12.y & 5.13.y can pick up the fix 8f34f1eac382 cleanly.  For the olders
> > > (5.7.y~5.11.y) they can't.  I plan to revert b569a1760782 instead.
> > > 
> > ...
> > > 
> > > Please let me know if there's any comment on the backport plan above, or I'll
> > > prepare the patches for all the branches before tomorrow.
> > 
> > Thanks for getting on to it so quickly, Peter.
> > 
> > The only non-EOL stable/longterm releases are then 5.13, 5.12 and 5.10.
> 
> I see, thanks.  I haven't explicitly backported patches to stable; it's a good
> chance to learn about the bits, though.
> 
> > 
> > I have no appreciation of the importance of UFFD_EVENT_FORK support
> > for uffd-wp.  And no appreciation of the importance of the other bugs
> > you fixed in 8f34f1eac382, and other uffd-wp fixes you may have made
> > recently, some backported, some not.
> > 
> > But I think it is worth giving 5.10, the longterm, a little more
> > consideration: don't be driven by whether 8f34f1eac382 applies cleanly
> > (all 5.13 and 5.12 would require then is a mail to GregKH Cc stable
> > asking him to add that commit), but by how important the support is
> > to users of 5.10, and how far away from working safely it is - maybe
> > a 5.10-specific patch would be worthwhile, maybe not, I cannot judge.
> 
> I am not aware of anyone who's using fork with uffd-wp.  CRIU is the major user
> per my knowledge that uses uffd fork, but still it shouldn't be using uffd-wp.
> I know other users that use uffd-wp, but never with fork event.

Okay. It does run a risk, that an unknown 5.10 longterm user is relying
on uffd-wp with uffd-fork, and the revert will break their usage in what
is supposed to be a stable kernel. But I'll let you be the judge of that.

> 
> Per my understanding, if above is true then it's probably not a good candidate
> for such patches that fixing uffd-wp + fork to be backported to stable, as in
> stable tree rules there's one entry:
> 
>   - It must fix a real bug that bothers people (not a, “This could be a
>     problem…” type thing).

BUG_ON(!PageLocked) is a real bug that bothered Igor (thougn not on 5.10).
I cannot estimate the seriousness of the other fixes in that commit.

> 
> https://www.kernel.org/doc/html/v5.13/process/stable-kernel-rules.html

I have not re-read those rules recently, but last time I questioned them,
the conclusion seemed to be that they're reasons the stable guys can give
for refusing a patch, but generally nowadays they put in much more than
fits those rules; to the extent that akpm has to restrain them from time
to time, insisting that mm patches for stable be his decision.

> 
> That's also one reason I didn't add Fixes for some patches because I am not
> sure whether that'll help anyone, and the worst case is if someone hit some
> issue we can backport explicitly.

Putting in Fixes generally helps, but whether then to Cc stable will
usually be akpm's decision based on our input.  In the case of a BUG
that hit a real user, I feel fairly confident predicting his decision.

> 
> There're a few other things that made me a bit worried before backporting the
> full patch, even for 5.12/5.13, as there're requirements on the patch:
> 
>   - It cannot be bigger than 100 lines, with context.
>   - It must fix only one thing.
> 
> The full patch (after squashed) contains ~200 LOC with context, and it does fix
> a few more things... Do you know whether that would be a problem?

I doubt it. But I'd have to spend too long reading the git-log man-page
to learn what options to give it to find an example in stable.git to
reassure you.

I'd worry much more about the quality of your commit than its size.
If you're uneasy about that now, then it should not have gone into
5.14-rc.  If you're confident in it, then I'd say it's good for stable:
unless you suspect interdependent fixes, some of which may already be
there in 5.10.y (not always by akpm's choice), some of which not: that
can get messy. Applying cleanly is not the most important criterion.

> 
> I'm not sure how strict would stable branches be, but if that's one blocker
> then we'll only be able to either pick up the real fix for copy_pmd_range() or
> just revert the issue commit which is just a few more lines; that should also
> keep the tree cleaner.  From that sense, maybe it's easier to just revert for
> all the branches (5.10/5.12/5.13).  Then if someone broke with uffd-wp+fork, I
> can backport the full patch with better reasoning.

5.10 is a long-lasting foundation for many, considerably more important
than 5.12 and 5.13 which will likely reach EOL soon: so yes, if you have
any doubts about 8f34f1eac382, or the way it will fit into earlier releases,
and think a revert of uffd-wp+fork is safer, that may be the right thing
for all those releases.

I feel like I've used a lot of words to say nothing: back to patches!

Hugh

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux