On Wed, Nov 06, 2019 at 11:54:38PM +0200, Jarkko Sakkinen wrote:
> On Mon, Nov 04, 2019 at 06:17:20PM -0800, Sean Christopherson wrote:
> > On Tue, Nov 05, 2019 at 12:26:58AM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Nov 04, 2019 at 12:46:02PM -0800, Sean Christopherson wrote:
> > > > On Mon, Nov 04, 2019 at 10:01:39PM +0200, Jarkko Sakkinen wrote:
> > > > > The reasoning is the same as in
> > > > >
> > > > > http://git.infradead.org/users/jjs/linux-tpmdd.git/commit/abd55954f91a3aacc1d260d2411cf776ec4d5fd2
> > > > >
> > > > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
> > > > > ---
> > > > >  arch/x86/kernel/cpu/sgx/ioctl.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > index 5b28a9c0cb68..d53aee5a64c1 100644
> > > > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > @@ -259,7 +259,7 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg)
> > > > >  	if (copy_from_user(&ecreate, arg, sizeof(ecreate)))
> > > > >  		return -EFAULT;
> > > > >
> > > > > -	secs_page = alloc_page(GFP_HIGHUSER);
> > > > > +	secs_page = alloc_page(GFP_KERNEL);
> > > > >  	if (!secs_page)
> > > > >  		return -ENOMEM;
> > > > >
> > > > > @@ -674,7 +674,7 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
> > > > >  	if (copy_from_user(&einit, arg, sizeof(einit)))
> > > > >  		return -EFAULT;
> > > > >
> > > > > -	initp_page = alloc_page(GFP_HIGHUSER);
> > > > > +	initp_page = alloc_page(GFP_KERNEL);
> > > >
> > > > Would it make sense to use GFP_KERNEL_ACCOUNT? The accounting would be
> > > > weird for the case where userspace is using a builder process, but even in
> > > > that case it's not flat out wrong to account per-enclave memory allocations.
> > >
> > > I did not find a single call site that would use that for allocating
> > > memory for function-internal data.
> >
> > Actually, the fact that the allocations are transient is an even better
> > argument for accounting the memory, as the weirdness I was referring to
> > doesn't exist for the builder concept.
> >
> > But looking more closely, Documentation/core-api/memory-allocation.rst
> > states:
> >
> >   * Untrusted allocations triggered from userspace should be a subject
> >     of kmem accounting and must have ``__GFP_ACCOUNT`` bit set. There
> >     is the handy ``GFP_KERNEL_ACCOUNT`` shortcut for ``GFP_KERNEL``
> >     allocations that should be accounted.
> >
> > That means all uses of GFP_KERNEL except in sgx_alloc_epc_section() should
> > be converted to GFP_KERNEL_ACCOUNT. As is, depending on fd limits[*], a
> > single process can easily burn through multiple GBs of memory simply by
> > opening /dev/sgx/enclave in a loop.
>
> What does the documentation mean by untrusted allocation?
>
> __GFP_ACCOUNT and GFP_KERNEL_ACCOUNT are both quite alien flags to me,
> as is kmemcg. Things that I know exist but have never had to deal
> with.
>
> Looking at the kernel source code they rarely get used. Many drivers
> have process bound data structures but none of the drivers use these
> flags. I'm wondering why.
>
> Why is sgx_alloc_epc_section() a use case given that it is something
> that allocates memory for the global EPC database?
>
> > [*] AFAICT, systemd is upping the max number of open files to 1M on my
> >     systems. I don't _think_ I changed a setting anywhere?

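For a rough feel of the fd-limit concern quoted above, here is a small
userspace sketch. It is only a back-of-the-envelope estimate: the 4 KiB
per open() figure and the helper names worst_case_bytes() and
read_fd_limit() are assumptions made up for illustration, not numbers or
functions taken from the driver. It reads the hard RLIMIT_NOFILE, since
an unprivileged process may raise its soft limit up to that value.

```c
#include <stdint.h>
#include <sys/resource.h>

/*
 * ASSUMPTION: unaccounted kernel memory pinned per open() of
 * /dev/sgx/enclave. One 4 KiB page is a placeholder for illustration,
 * not a value measured from the driver.
 */
#define ASSUMED_BYTES_PER_OPEN 4096ULL

/*
 * Worst-case bytes a single process could pin with nr_fds open files.
 * Saturates at UINT64_MAX instead of wrapping around.
 */
uint64_t worst_case_bytes(uint64_t nr_fds, uint64_t bytes_per_open)
{
	if (bytes_per_open && nr_fds > UINT64_MAX / bytes_per_open)
		return UINT64_MAX;

	return nr_fds * bytes_per_open;
}

/*
 * Read the hard RLIMIT_NOFILE of the calling process.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
int read_fd_limit(uint64_t *limit)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl))
		return -1;

	*limit = (uint64_t)rl.rlim_max;
	return 0;
}
```

With a limit of 1M (1048576) files and the assumed 4 KiB per open,
worst_case_bytes() comes to 4 GiB, which would be in line with the
"multiple GBs" estimate, provided the per-open cost is anywhere near
that assumption.
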
Anyway, the tree is now updated:

diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 5b82670bb79a..d53aee5a64c1 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -259,7 +259,7 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg)
 	if (copy_from_user(&ecreate, arg, sizeof(ecreate)))
 		return -EFAULT;

-	secs_page = alloc_page(GFP_HIGHUSER);
+	secs_page = alloc_page(GFP_KERNEL);
 	if (!secs_page)
 		return -ENOMEM;

@@ -427,12 +427,20 @@ static int sgx_encl_add_page(struct sgx_encl *encl,
 	if (addp->flags & SGX_PAGE_MEASURE) {
 		ret = __sgx_encl_extend(encl, epc_page);
-		if (ret)
+
+		/*
+		 * Destroy the enclave if EEXTEND fails, EADD can't be undone.
+		 * Note, destroy() also frees the resources for the added page.
+		 */
+		if (ret) {
 			sgx_encl_destroy(encl);
-		else
-			sgx_mark_page_reclaimable(encl_page->epc_page);
+			goto out_unlock;
+		}
 	}

+	sgx_mark_page_reclaimable(encl_page->epc_page);
+
+out_unlock:
 	mutex_unlock(&encl->lock);
 	up_read(&current->mm->mmap_sem);
 	return ret;
@@ -666,7 +674,7 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
 	if (copy_from_user(&einit, arg, sizeof(einit)))
 		return -EFAULT;

-	initp_page = alloc_page(GFP_HIGHUSER);
+	initp_page = alloc_page(GFP_KERNEL);
 	if (!initp_page)
 		return -ENOMEM;

Hope that all the updates will be fairly localized :-)

/Jarkko