On Wed, Nov 06, 2019 at 11:59:42PM +0200, Jarkko Sakkinen wrote:
> On Wed, Nov 06, 2019 at 11:54:38PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Nov 04, 2019 at 06:17:20PM -0800, Sean Christopherson wrote:
> > > On Tue, Nov 05, 2019 at 12:26:58AM +0200, Jarkko Sakkinen wrote:
> > > > On Mon, Nov 04, 2019 at 12:46:02PM -0800, Sean Christopherson wrote:
> > > > > On Mon, Nov 04, 2019 at 10:01:39PM +0200, Jarkko Sakkinen wrote:
> > > > > > The reasoning is the same as in
> > > > > >
> > > > > > http://git.infradead.org/users/jjs/linux-tpmdd.git/commit/abd55954f91a3aacc1d260d2411cf776ec4d5fd2
> > > > > >
> > > > > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
> > > > > > ---
> > > > > >  arch/x86/kernel/cpu/sgx/ioctl.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > > index 5b28a9c0cb68..d53aee5a64c1 100644
> > > > > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > > @@ -259,7 +259,7 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg)
> > > > > >  	if (copy_from_user(&ecreate, arg, sizeof(ecreate)))
> > > > > >  		return -EFAULT;
> > > > > >
> > > > > > -	secs_page = alloc_page(GFP_HIGHUSER);
> > > > > > +	secs_page = alloc_page(GFP_KERNEL);
> > > > > >  	if (!secs_page)
> > > > > >  		return -ENOMEM;
> > > > > >
> > > > > > @@ -674,7 +674,7 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
> > > > > >  	if (copy_from_user(&einit, arg, sizeof(einit)))
> > > > > >  		return -EFAULT;
> > > > > >
> > > > > > -	initp_page = alloc_page(GFP_HIGHUSER);
> > > > > > +	initp_page = alloc_page(GFP_KERNEL);
> > > > >
> > > > > Would it make sense to use GFP_KERNEL_ACCOUNT?  The accounting would be
> > > > > weird for the case where userspace is using a builder process, but even in
> > > > > that case it's not flat out wrong to account per-enclave memory allocations.
> > > >
> > > > I did not find a single call site that would use that for allocating
> > > > memory for function-internal data.
> > >
> > > Actually, the fact that the allocations are transient is an even better
> > > argument for accounting the memory, as the weirdness I was referring to
> > > doesn't exist for the builder concept.
> > >
> > > But looking more closely, Documentation/core-api/memory-allocation.rst
> > > states:
> > >
> > >   * Untrusted allocations triggered from userspace should be a subject
> > >     of kmem accounting and must have ``__GFP_ACCOUNT`` bit set.  There
> > >     is the handy ``GFP_KERNEL_ACCOUNT`` shortcut for ``GFP_KERNEL``
> > >     allocations that should be accounted.
> > >
> > > That means all uses of GFP_KERNEL except in sgx_alloc_epc_section() should
> > > be converted to GFP_KERNEL_ACCOUNT.  As is, depending on fd limits[*], a
> > > single process can easily burn through multiple GBs of memory simply by
> > > opening /dev/sgx/enclave in a loop.
> >
> > What does the documentation mean by untrusted allocation?
> >
> > __GFP_ACCOUNT and GFP_KERNEL_ACCOUNT are both quite alien flags
> > to me, as is kmemcg. Things that I know exist but have never had to
> > deal with.
> >
> > Looking at the kernel source code they rarely get used. Many drivers
> > have process bound data structures but none of the drivers use these
> > flags. I'm wondering why.
> >
> > Why is sgx_alloc_epc_section() a use case given that it is something
> > that allocates memory for the global EPC database?
> >
> > > [*] AFAICT, systemd is upping the max number of open files to 1M on my
> > >     systems.  I don't _think_ I changed a setting anywhere?
>
> Anyway, the tree is now updated:
>
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 5b82670bb79a..d53aee5a64c1 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -259,7 +259,7 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg)
>  	if (copy_from_user(&ecreate, arg, sizeof(ecreate)))
>  		return -EFAULT;
>
> -	secs_page = alloc_page(GFP_HIGHUSER);
> +	secs_page = alloc_page(GFP_KERNEL);
>  	if (!secs_page)
>  		return -ENOMEM;
>
> @@ -427,12 +427,20 @@ static int sgx_encl_add_page(struct sgx_encl *encl,
>
>  	if (addp->flags & SGX_PAGE_MEASURE) {
>  		ret = __sgx_encl_extend(encl, epc_page);
> -		if (ret)
> +
> +		/*
> +		 * Destroy the enclave if EEXTEND fails, EADD can't be undone.
> +		 * Note, destroy() also frees the resources for the added page.
> +		 */
> +		if (ret) {
>  			sgx_encl_destroy(encl);
> -		else
> -			sgx_mark_page_reclaimable(encl_page->epc_page);
> +			goto out_unlock;
> +		}
>  	}
>
> +	sgx_mark_page_reclaimable(encl_page->epc_page);
> +
> +out_unlock:
>  	mutex_unlock(&encl->lock);
>  	up_read(&current->mm->mmap_sem);
>  	return ret;
> @@ -666,7 +674,7 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
>  	if (copy_from_user(&einit, arg, sizeof(einit)))
>  		return -EFAULT;
>
> -	initp_page = alloc_page(GFP_HIGHUSER);
> +	initp_page = alloc_page(GFP_KERNEL);
>  	if (!initp_page)
>  		return -ENOMEM;
>

Hope that all the updates will be fairly localized :-)

Also removed some patches on top that were pushed by accident (patches
under review).

/Jarkko