Re: [RFC PATCH v4 2/4] x86/sgx: Implement support for MADV_WILLNEED

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 27, 2023, Kai Huang wrote:
> > > > 
> > > > You can't build an ABI on assumptions.  E.g. even if userspace *intends* to behave,
> > > > it wouldn't take much to compute the wrong offset (math is hard).
> > > 
> > > But I don't think we have an well established ABI now?  Nothing is documented.
> > 
> > Heh, just because the ABI isn't formally documented doesn't mean it doesn't exist.
> > In fact, the most problematic ABIs are the ones that aren't documented.
> > 
> > > 
> 
> Sure.  But just want to make sure, what is the current SGX driver ABI (around
> mmap()) from your perspective?

To be clear, it's not my perspective, it's simply what the kernel actually does.

> Is it "SGX driver _requires_ pgoff to be 0 for non-ELRANGE-reserve mmap()", or
> "SGX driver _ignores_ pgoff"?

Unless there are checks hiding somewhere, it's the latter.  You might be able to
get away with changing the kernel to require pgoff to be '0', i.e. if literally
all users pass in '0', but proving that all users pass '0' is extremely difficult.
And I don't see any value in requiring pgoff to be '0' for "legacy" users.

> See below ...
> 
> [...]
> 
> > > > 	if (encl->has_mismatched_offsets)  <======
> > > > 		goto unlock;
> > > 
> > > Sorry I am a little bit slow, how do you set "has_mismatched_offsests" to true?
> > 
> > During sgx_mmap().  Though there's a wrinkle I initially missed: if the enclave
> > hasn't gone through ECREATE, encl->base is garbage.  So either sgx_mmap() needs
> > to assume the !CREATED is creating a mismatched offset, or sgx_encl_create() needs
> > to iterate over and check all VMAs.
> > 
> > Since there are advantages to usuing mmap(NULL) to find ELRANGE, IMO your best
> > option is to do the below.  And then that mostly answers the question about
> > using mmap(/dev/sgx_enclave) to reserve ELRANGE, i.e. if userspace wants to use
> > fallocate(), then it effectively *must not* use mmap(/dev/sgx_enclave).
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
> > index 262f5fb18d74..63fb41da35aa 100644
> > --- a/arch/x86/kernel/cpu/sgx/driver.c
> > +++ b/arch/x86/kernel/cpu/sgx/driver.c
> > @@ -94,6 +94,11 @@ static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
> >         if (ret)
> >                 return ret;
> >  
> > +       if (!test_bit(SGX_ENCL_CREATED, &encl->flags) ||
> > +           vma->vm_pgoff != PFN_DOWN(vma->vm_start - encl->base))
> > +               encl->has_mismatched_offsests = true;
> > +
> > +
> >         vma->vm_ops = &sgx_vm_ops;
> >         vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO);
> >         vma->vm_private_data = encl;
> > 
> 
> It appears we don't need to mark "has_mismatched_offsets" as true if userspace
> uses mmap(/dev/sgx_enclave) to reserve ELRANGE?  Enclave's base isn't
> established yet, so theoretically, from enclave's perspective, we cannot say it
> has a mismatched offset.

Yeah, it's a misnomer.  Just pick a different name and that problem goes away.

	if (!test_bit(SGX_ENCL_CREATED, &encl->flags) ||
	    vma->vm_pgoff != PFN_DOWN(vma->vm_start - encl->base))
		set_bit(SGX_ENCL_DISABLE_FILE_APIS, &encl->flags);

> However we may want to return -EINVAL if userspace passes a non-zero pgoff in
> the mmap(/dev/sgx_enclave) to reserve ELRANGE.

That would potentially break userspace.  I don't personally care if you want to
try that route, but just be aware that pushing through such a change may break
someone's application.  And if that happens, be prepared to get yelled at :-)

> Anyway reserving ELRANGE isn't the big point.  The SGX driver ABI is.  Let's
> forget about reserving ELRANGE in below context, i.e., all mmap() below means
> non-ELRANGE-reserve mmap(/dev/sgx_enclave) :)
> 
> I think yes marking enclave as "has_mismatched_offsets" works if userspace wants
> to use file-based ABIs (fadvice(), etc) for enclave, but this effectively means
> Haitao needs to modify _ALL_ existing mmap()s to pass the correct pgoff in order
> to use fadvice() for EAUG.
> 
> Therefore, it seems more like we are changing ABI (or having a new ABI) to:

It's effectively new ABI.  The key point to all of this is Linus' mantra that
"we don't break userspace".   Since there can't possibly applications using fadvise()
on /dev/sgx_enclave, the kernel can define new requirements for using fadvise()
without breaking userspace.

> Userspace must pass the correct pgoff to all mmaps() in order to use file-based
> ABIs for SGX.
> 
> ( Because I think other drivers/subsystems doesn't have such limitation.  For
> instance, for a normal file, userspace can have two mmap()s mapping to the same
> offset of the same file but with different address, and fadvice() should work
> for both mappings.
>
> A VMA-based "has_mismatched_offset" seems more reasonable for SGX but as I
> mentioned previously it may not easy to do from SGX driver. )
> 
> So I am not sure whether this belongs to "breaking existing ABI"?

I think you're fixated too much on precisely defining the concept of ABI.  The
question that you really want to ask is "could change XYZ break userspace?"

> If the current ABI is we _require_ pgoff being zero, then the "encl-
> >has_mismatched_offsets" seems is breaking this ABI.
> 
> If the current ABI is we _ignore_ pgoff, then "encl->has_mismatched_offsets"
> isn't breaking existing ABI.  But in this case why do we need this at all?

Because SGX has users in the wild that don't set pgoff correctly.  Changing the
kernel to require an accurate pgoff would break those users.

> SGX driver just follows the existing file-based ABIs:
> 
> If userspace wants to use file-based ABIs for some VMA, it needs to pass the
> correct pgoff.  If not, file-based ABIs won't work for that VMA, but your
> enclave may still work if you don't use file-based ABIs.
> 
> Sorry for being noisy, but does above make sense?



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux