On Fri, Nov 30, 2018 at 1:20 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > On Fri, Nov 30, 2018 at 3:43 AM Sebastian Andrzej Siewior > <bigeasy@xxxxxxxxxxxxx> wrote: > > > > On 2018-11-29 13:43:58 [-0800], Kees Cook wrote: > > > On Mon, Nov 26, 2018 at 9:04 AM Sebastian Andrzej Siewior > > > <bigeasy@xxxxxxxxxxxxx> wrote: > > > This bug got handled by Jann Horn, yes? (I remember seeing a related > > > thread go by...) > > > > Correct, fix sits in the tip tree. Nevertheless it was useful to spot > > the other thing. > > > > > > This looks like it comes from commit 21b3ddd39fee ("efi: Don't use > > > > spinlocks for efi vars") because it replaced a spin_lock() with > > > > down_interruptible() in this case. In this case, since pstore_dump() > > > > holds psinfo->buf_lock with disabled interrupts we can't block… > > > > > > > > I have this as a workaround: > > > > > > I'm not sure this is correct... > > > > > > > diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c > > > > index 9336ffdf6e2c..d4dcfe46eb2e 100644 > > > > --- a/drivers/firmware/efi/vars.c > > > > +++ b/drivers/firmware/efi/vars.c > > > > @@ -755,7 +755,9 @@ int efivar_entry_set_safe(efi_char16_t *name, efi_guid_t vendor, u32 attributes, > > > > return -EINTR; > > > > } > > > > > > > > - status = check_var_size(attributes, size + ucs2_strsize(name, 1024)); > > > > + status = ops->query_variable_store(attributes, > > > > + size + ucs2_strsize(name, 1024), > > > > + block); > > > > > > Why does this change help? > > > > check_var_size() uses false as the argument for `block'. > > check_var_size_nonblocking uses true as the argument for `block'. > > Doing this ops->… allows to pass the `block' argument as received from > > caller. And the functions above already use the `block' argument. Plus > > the next hunk makes sure that `block' is set properly. > > Here's the code I see: > > if (!block && ops->set_variable_nonblocking) > return efivar_entry_set_nonblocking(name, vendor, attributes, > size, data); > > if (!block) { > if (down_trylock(&efivars_lock)) > return -EBUSY; > } else { > if (down_interruptible(&efivars_lock)) > return -EINTR; > } > > status = check_var_size(attributes, size + ucs2_strsize(name, 1024)); > if (status != EFI_SUCCESS) { > up(&efivars_lock); > return -ENOSPC; > } > > status = ops->set_variable(name, &vendor, attributes, size, data); > > up(&efivars_lock); > > In the !block case, this will already take the > efivar_entry_set_nonblocking() path. Isn't that sufficient? (i.e. I'm > not sure why I see a change needed in the EFI code, just the > pstore_cannot_block_path() path? > > > > > > > if (status != EFI_SUCCESS) { > > > > up(&efivars_lock); > > > > return -ENOSPC; > > > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c > > > > index b821054ca3ed..9aa27a2c8d36 100644 > > > > --- a/fs/pstore/platform.c > > > > +++ b/fs/pstore/platform.c > > > > @@ -127,10 +127,10 @@ static const char *get_reason_str(enum kmsg_dump_reason reason) > > > > bool pstore_cannot_block_path(enum kmsg_dump_reason reason) > > > > { > > > > /* > > > > - * In case of NMI path, pstore shouldn't be blocked > > > > - * regardless of reason. > > > > + * In case of disabled preemption / interrupts we can't block on any > > > > + * lock regardless of reason. > > > > */ > > > > - if (in_nmi()) > > > > + if (in_atomic() || irqs_disabled()) > > > > return true; > > > > > > > > switch (reason) { > > > > > > I'd like to avoid any case where pstore might "miss" a crash report, > > > though... this seems to make it easier for it to fail to record a > > > crash, yes? > > > > Well, if the lock is occupied, yes. But waiting for the completion with > > disabled interrupts isn't much better :) > > > > pstore_cannot_block_path() is used in two places. One caller is taking a > > spin_lock(). So that one could keep using the "old" function. > > > > The other caller uses it before acquiring the semaphore. The non-try > > version can't be used with preemption or interrupts disabled. > > > > You could split those two cases and let the sempaphire user use the > > "(in_atomic() || irqs_disabled())" from above. > > preempt.h:#define in_nmi() (preempt_count() & NMI_MASK) > preempt.h:#define in_atomic() (preempt_count() != 0) > > NMI is a special case of preempt_count() test. in_atomic() is a > non-zero preempt_count() test. It looks like we want to use > preemptible()? > > #define preemptible() (preempt_count() == 0 && !irqs_disabled()) > > So, perhaps: > > - if (in_nmi()) > + if (!preemptible()) > > What do you think? I spent some more time looking at this, and I think the best fix would be to switch pstore also to a semaphore, fix up the logic, and then replace the efi call to !pstore_cannot_block_path() with a preemptible() call: ret = efivar_entry_set_safe(efi_name, vendor, PSTORE_EFI_ATTRIBUTES, - !pstore_cannot_block_path(record->reason), - record->size, record->psi->buf); + preemptible(), record->size, record->psi->buf); -- Kees Cook