On Tue, Oct 13, 2020 at 11:43:57AM -0700, Dave Hansen wrote: > > +static inline void pks_update_protection(int pkey, unsigned long protection) > > +{ > > + current->thread.saved_pkrs = update_pkey_val(current->thread.saved_pkrs, > > + pkey, protection); > > + preempt_disable(); > > + write_pkrs(current->thread.saved_pkrs); > > + preempt_enable(); > > +} > > Why does this need preempt count manipulation in addition to the > get/put_cpu_var() inside of write_pkrs()? This is a bug. The disable should be around the update_pkey_val(). > > > +/** > > + * PKS access control functions > > + * > > + * Change the access of the domain specified by the pkey. These are global > > + * updates. They only affects the current running thread. It is undefined and > > + * a bug for users to call this without having allocated a pkey and using it as > > + * pkey here. > > + * > > + * pks_mknoaccess() > > + * Disable all access to the domain > > + * pks_mkread() > > + * Make the domain Read only > > + * pks_mkrdwr() > > + * Make the domain Read/Write > > + * > > + * @pkey the pkey for which the access should change. > > + * > > + */ > > +void pks_mknoaccess(int pkey) > > +{ > > + pks_update_protection(pkey, PKEY_DISABLE_ACCESS); > > +} > > +EXPORT_SYMBOL_GPL(pks_mknoaccess); > > These are named like PTE manipulation functions, which is kinda weird. > > What's wrong with: pks_disable_access(pkey) ? Internal review suggested these names. I'm not dead set on them. FWIW I would rather they not get to wordy. I was trying to get some consistency with pks_mk*() as meaning PKS 'make' X. Do me 'disable' implies a state transition where 'make' implies we are 'setting' an absolute value. I think the later is a better name. And 'make' made more sense because 'set' is so overloaded IHO. > > > +void pks_mkread(int pkey) > > +{ > > + pks_update_protection(pkey, PKEY_DISABLE_WRITE); > > +} > > +EXPORT_SYMBOL_GPL(pks_mkread); > > I really don't like this name. It doesn't make readable, or even > read-only, *especially* if it was already access-disabled. Ok. But it does sense if going from access-disable to read, correct?. I could see this being better named pks_mkreadonly() so that going from RW to this would make more sense. Especially after thinking about it above 'read only' needs to be in the name. Before I change anything I'd like to get consensus on naming. How about the following? pks_mk_noaccess() pks_mk_readonly() pks_mk_readwrite() ? > > > +static const char pks_key_user0[] = "kernel"; > > + > > +/* Store names of allocated keys for debug. Key 0 is reserved for the kernel. */ > > +static const char *pks_key_users[PKS_NUM_KEYS] = { > > + pks_key_user0 > > +}; > > + > > +/* > > + * Each key is represented by a bit. Bit 0 is set for key 0 and reserved for > > + * its use. We use ulong for the bit operations but only 16 bits are used. > > + */ > > +static unsigned long pks_key_allocation_map = 1 << PKS_KERN_DEFAULT_KEY; > > + > > +/* > > + * pks_key_alloc - Allocate a PKS key > > + * > > + * @pkey_user: String stored for debugging of key exhaustion. The caller is > > + * responsible to maintain this memory until pks_key_free(). > > + */ > > +int pks_key_alloc(const char * const pkey_user) > > +{ > > + int nr; > > + > > + if (!cpu_feature_enabled(X86_FEATURE_PKS)) > > + return -EINVAL; > > I'm not sure I like -EINVAL for this. I thought we returned -ENOSPC for > this case for user pkeys. -ENOTSUP? I'm not really sure anyone will need to know the difference between the platform not supporting the key vs running out of them. But they are 2 different error conditions. > > > + while (1) { > > + nr = find_first_zero_bit(&pks_key_allocation_map, PKS_NUM_KEYS); > > + if (nr >= PKS_NUM_KEYS) { > > + pr_info("Cannot allocate supervisor key for %s.\n", > > + pkey_user); > > + return -ENOSPC; We return -ENOSPC here when running out of keys. > > + } > > + if (!test_and_set_bit_lock(nr, &pks_key_allocation_map)) > > + break; > > + } > > + > > + /* for debugging key exhaustion */ > > + pks_key_users[nr] = pkey_user; > > + > > + return nr; > > +} > > +EXPORT_SYMBOL_GPL(pks_key_alloc); > > + > > +/* > > + * pks_key_free - Free a previously allocate PKS key > > + * > > + * @pkey: Key to be free'ed > > + */ > > +void pks_key_free(int pkey) > > +{ > > + if (!cpu_feature_enabled(X86_FEATURE_PKS)) > > + return; > > + > > + if (pkey >= PKS_NUM_KEYS || pkey <= PKS_KERN_DEFAULT_KEY) > > + return; > > This seems worthy of a WARN_ON_ONCE() at least. It's essentially > corrupt data coming into a kernel API. Ok, Done, Ira