Re: [PATCH v5 18/39] mm: Handle faultless write upgrades for shstk

David Hildenbrand <david@xxxxxxxxxx> · Tue, 31 Jan 2023 09:46:27 +0100

On 28.01.23 01:51, Edgecombe, Rick P wrote:
On Fri, 2023-01-27 at 17:12 +0100, David Hildenbrand wrote:
On 26.01.23 21:19, Edgecombe, Rick P wrote:
On Thu, 2023-01-26 at 09:46 +0100, David Hildenbrand wrote:
On 26.01.23 01:59, Edgecombe, Rick P wrote:
On Wed, 2023-01-25 at 10:43 -0800, Rick Edgecombe wrote:
Thanks for your comments and ideas here, I'll give the:
pte_t pte_mkwrite(struct vm_area_struct *vma, pte_t pte)
...solution a try.

Well, it turns out there are some pte_mkwrite() callers in
other
arch's
that operate on kernel memory and don't have a VMA. So it
needed a
new

Why not pass in NULL as VMA then and document the semantics? The
less
similarly named but slightly different functions, the better :)

Hmm. The x86 and generic versions should probably have the same
semantics, so then if you pass a NULL, it would do a regular
pte_mkwrite() I guess?

I see another benefit of requiring the vma argument, such that raw
pte_mkwrite()s are less likely to appear in core MM code. But I
think
the NULL is awkward because it's not obvious, to me at least, what
the
implications of that should be.

So it will be confusing to read in the NULL cases for the other
archs.
We also have some warnings to catch miss cases in the PTE tear down
code, so the scenario of new code accidentally marking shadow stack
PTEs as writable is not totally unchecked.

The three functions that do slightly different things are:

pte_mkwrite():
Makes a PTE conventionally writable, only takes a PTE. Very clear
that
it is a low level helper and what it does.

maybe_mkwrite():
Might make a PTE writable if the VMA allows it.

pte_mkwrite_vma():
Makes a PTE writable in a specific way depending on the VMA

I wonder if the name pte_mkwrite_vma() is maybe just not clear
enough.
It takes a VMA, yes, but what does it do with it?

What if it was called pte_mkwrite_type() instead? Some arch's have
additional types of writable memory and this function creates them.
Of
course they also have the normal type of writable memory, and
pte_mkwrite() creates that like usual. Doesn't it seem more
readable?

The issue is, the more variants we provide the easier it is to make
mistakes and introduce new buggy code.

It's tempting to simply use pte_mkwrite() and call it a day, where
people actually should use pte_mkwrite_vma().

Then, they at least have to investigate what to do about the second
VMA
parameter.

Ok, I'll give it a spin. So far it looks ok. The downside is the giant
tree-wide pte_mkwrite() signature change, but once that is over with
there are other advantages. Like getting rid of maybe_mkwrite()'s
awareness of shadow stack so the logic is more centralized. Please let
me know if you don't feel comfortable with a suggested-by credit tag.

Sure ...

but I reconsidered :)

Maybe there is a cleaner way to do it and avoid the "NULL" argument.

What about having (while you're going over everything already):

pte_mkwrite(pte, vma)
pte_mkwrite_kernel(pte)

The latter would only be used in that arch code where we're working on 
kernel pgtables. We already have pte_offset_kernel() and 
pte_alloc_kernel_track(), so it's not too weird.

--
Thanks,

David / dhildenb