On Wed, 2020-10-28 at 13:30 +0200, Mike Rapoport wrote:
> On Wed, Oct 28, 2020 at 11:20:12AM +0000, Will Deacon wrote:
> > On Tue, Oct 27, 2020 at 10:38:16AM +0200, Mike Rapoport wrote:
> > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > > On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> > > > > On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > > > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > > > Indeed, for architectures that define
> > > > > > > CONFIG_ARCH_HAS_SET_DIRECT_MAP it is possible that
> > > > > > > __kernel_map_pages() would fail, but since this function is
> > > > > > > void, the failure will go unnoticed.
> > > > > >
> > > > > > Could you elaborate on how this could happen? Do you mean
> > > > > > during runtime today or if something new was introduced?
> > > > >
> > > > > A failure in __kernel_map_pages() may happen today. For
> > > > > instance, on x86 if the kernel is built with DEBUG_PAGEALLOC,
> > > > >
> > > > > __kernel_map_pages(page, 1, 0);
> > > > >
> > > > > will need to split, say, a 2M page, and during the split an
> > > > > allocation of a page table could fail.
> > > >
> > > > On x86 at least, DEBUG_PAGEALLOC expects to never have to break
> > > > a page on the direct map, and even disables locking in cpa
> > > > because it assumes this. If this is happening somehow anyway,
> > > > then we should probably fix that. Even if it's a debug feature,
> > > > it will not be as useful if it is causing its own crashes.
> > > >
> > > > I'm still wondering if there is something I'm missing here. It
> > > > seems like you are saying there is a bug in some arches, so
> > > > let's add a WARN in cross-arch code to log it as it crashes. A
> > > > warn and making things clearer seem like good ideas, but if
> > > > there is a bug we should fix it. The code around the callers
> > > > still functionally assumes re-mapping can't fail.
> > >
> > > Oh, I meant an x86 kernel *without* DEBUG_PAGEALLOC, and indeed
> > > the call that unmaps the pages back in safe_copy_page will just
> > > reset a 4K page to NP, because whatever made it NP in the first
> > > place already did the split.
> > >
> > > Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of
> > > a race between the map/unmap dance in __vunmap() and
> > > safe_copy_page() that may cause access to unmapped memory:
> > >
> > > __vunmap()
> > >     vm_remove_mappings()
> > >         set_direct_map_invalid()
> > >                                  safe_copy_page()
> > >                                      __kernel_map_pages()
> > >                                          return
> > >                                      do_copy_page() -> fault
> > >
> > > This is a theoretical bug, but it is still not nice :)
> >
> > Just to clarify: this patch series fixes this problem, right?
>
> Yes.

Well, now I'm confused again.

As David pointed out, __vunmap() should not be executing simultaneously
with the hibernate operation, because hibernate can't snapshot while
data it needs to save is still updating. If a thread was paused while a
page was in an "invalid" state, it should be remapped by hibernate
before the copy.
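Roughly the behavior I have in mind is what safe_copy_page() already
does today (paraphrasing kernel/power/snapshot.c from memory, so treat
the details below as approximate):

    static void safe_copy_page(void *dst, struct page *s_page)
    {
            if (kernel_page_present(s_page)) {
                    do_copy_page(dst, page_address(s_page));
            } else {
                    /*
                     * The page is not present in the direct map (e.g.
                     * unmapped by DEBUG_PAGEALLOC or set_direct_map_*),
                     * so map it back, copy it, then restore the
                     * previous state.
                     */
                    kernel_map_pages(s_page, 1, 1);
                    do_copy_page(dst, page_address(s_page));
                    kernel_map_pages(s_page, 1, 0);
            }
    }

So as long as nothing can change a page's mapping state while the
snapshot is being taken, the remap-copy-unmap there should already
cover the "paused while invalid" case.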
To level set, before reading this mail, my takeaways from the
discussions on potential hibernate/debug page alloc problems were:

Potential RISC-V issue:
Doesn't have hibernate support.

Potential ARM issue:
The logic around when its cpa code determines pages might be unmapped
looks correct for current callers.

Potential x86 page break issue:
Seems to be ok for now, but a new set_memory_np() caller could violate
assumptions in hibernate.

Non-obvious thorny logic:
General agreement it would be good to separate dependencies.

Behavior of V1 of this patchset:
No functional change other than the addition of a warn in hibernate
(a sketch of the kind of warn I mean is at the end of this mail).

So the "does this fix the problem?", "yes" exchange above leaves me a
bit confused...

Not saying there couldn't be any problems, especially due to the
thorniness and cross arch stride, but what is it exactly, and how does
this series fix it?
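For reference, the warn I was picturing for V1 is something along the
lines of the sketch below. To be clear, this is just my sketch of the
idea, not the actual patch, and hibernate_remap_page() is a name I'm
making up here:

    #include <linux/mm.h>
    #include <linux/set_memory.h>

    /* Sketch only -- not the real patch; hibernate_remap_page() is made up. */
    static void hibernate_remap_page(struct page *page)
    {
            if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
                    /*
                     * Re-mapping here should only flip protection bits
                     * in an existing PTE, so it is not expected to
                     * fail. If it ever does, warn instead of silently
                     * copying from an unmapped page the way the void
                     * __kernel_map_pages() path would.
                     */
                    WARN_ON(set_direct_map_default_noflush(page));
            } else {
                    /* Existing DEBUG_PAGEALLOC path; no-op otherwise. */
                    kernel_map_pages(page, 1, 1);
            }
    }

If that is more or less what V1 does, then I agree it makes a failure
visible, but it still isn't obvious to me what concrete bug the series
is fixing.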