On Mon, Nov 04, 2019 at 06:17:20PM -0800, Sean Christopherson wrote:
> On Tue, Nov 05, 2019 at 12:26:58AM +0200, Jarkko Sakkinen wrote:
> > On Mon, Nov 04, 2019 at 12:46:02PM -0800, Sean Christopherson wrote:
> > > On Mon, Nov 04, 2019 at 10:01:39PM +0200, Jarkko Sakkinen wrote:
> > > > The reasoning is the same as in
> > > >
> > > > http://git.infradead.org/users/jjs/linux-tpmdd.git/commit/abd55954f91a3aacc1d260d2411cf776ec4d5fd2
> > > >
> > > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
> > > > ---
> > > >  arch/x86/kernel/cpu/sgx/ioctl.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > index 5b28a9c0cb68..d53aee5a64c1 100644
> > > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > @@ -259,7 +259,7 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg)
> > > >  	if (copy_from_user(&ecreate, arg, sizeof(ecreate)))
> > > >  		return -EFAULT;
> > > >
> > > > -	secs_page = alloc_page(GFP_HIGHUSER);
> > > > +	secs_page = alloc_page(GFP_KERNEL);
> > > >  	if (!secs_page)
> > > >  		return -ENOMEM;
> > > >
> > > > @@ -674,7 +674,7 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
> > > >  	if (copy_from_user(&einit, arg, sizeof(einit)))
> > > >  		return -EFAULT;
> > > >
> > > > -	initp_page = alloc_page(GFP_HIGHUSER);
> > > > +	initp_page = alloc_page(GFP_KERNEL);
> > >
> > > Would it make sense to use GFP_KERNEL_ACCOUNT? The accounting would be
> > > weird for the case where userspace is using a builder process, but even in
> > > that case it's not flat out wrong to account per-enclave memory allocations.
> >
> > I did not find a single call site that would use that for allocating
> > memory for function-internal data.
>
> Actually, the fact that the allocations are transient is an even better
> argument for accounting the memory, as the weirdness I was referring to
> doesn't exist for the builder concept.
>
> But looking more closely, Documentation/core-api/memory-allocation.rst
> states:
>
>   * Untrusted allocations triggered from userspace should be a subject
>     of kmem accounting and must have ``__GFP_ACCOUNT`` bit set. There
>     is the handy ``GFP_KERNEL_ACCOUNT`` shortcut for ``GFP_KERNEL``
>     allocations that should be accounted.
>
> That means all uses of GFP_KERNEL except in sgx_alloc_epc_section() should
> be converted to GFP_KERNEL_ACCOUNTED. As is, depending on fd limits[*], a
> single process can easily burn through multiple GBs of memory simply by
> opening /dev/sgx/enclave in a loop.

What does the documentation mean by an untrusted allocation? __GFP_ACCOUNT
and GFP_KERNEL_ACCOUNT are both quite alien flags to me, as is kmemcg.
They are things that I know exist but have never had to deal with.

Looking at the kernel source code, they rarely get used. Many drivers have
process-bound data structures, but none of them use these flags. I'm
wondering why.

Why is sgx_alloc_epc_section() a use case, given that it allocates memory
for the global EPC database?

> [*] AFAICT, systemd is upping the max number of open files to 1M on my
>     systems. I don't _think_ I changed a setting anywhere?

/Jarkko