On 2024-11-29 15:48, Herbert Xu wrote:
On Fri, Nov 29, 2024 at 12:10:58PM +0100, Harald Freudenberger wrote:
+static inline int phmac_keyblob2pkey(const u8 *key, unsigned int keylen,
+ struct phmac_protkey *pk)
+{
+ int i, rc = -EIO;
+
+ /* try three times in case of busy card */
+ for (i = 0; rc && i < 3; i++) {
+ if (rc == -EBUSY && msleep_interruptible(1000))
+ return -EINTR;
You can't sleep in an ahash algorithm either. What you can do
however is schedule a delayed work and pick up where you left
off. That's how asynchronous completion works.
But my question still stands, under what circumstances can
this fail? I don't think storage folks will be too happy with
a crypto algorithm that can produce random failures.
Cheers,
- The attempt to derive a protected key usable by the cpacf instructions
depends on the raw key material used. For 'clear key' material the
derivation process is a simple instruction which can't fail.
The preferred way, however, is to use 'secure key' material, which
is transferred to a crypto card and re-wrapped there to be usable
with cpacf instructions. This requires communication with a crypto
card and thus may fail - because there is no card at all, or there
is temporarily no card available, or the card is in a bad state. If there
is no usable card, the AP bus returns -EBUSY from the pkey_key2protkey()
function and triggers an asynchronous bus scan. As long as this scan
is running (usually about 100ms or so), -EBUSY is returned to indicate
that the caller should retry "later". Other states are covered by
other return codes like -ENODEV or -EIO, and the caller is not supposed
to loop but should fail. When there is no accessible hardware available
to derive a protected key, either the user or the admin broke something
or something went badly wrong, and then there is nothing left to do:
the storage device must fail.
- How can it happen that a re-derive is needed? A re-derive is triggered
when the cpacf instruction detects that the protected key is no longer
valid. A protected key includes a verification pattern (hash) of the
firmware key used to encrypt the key. This hash is checked on each
invocation of a cpacf instruction. So when code execution "awakes" on
another machine ("live guest migration" of a KVM guest to another
machine), the next cpacf instruction will complain about a verification
pattern mismatch and the protected key needs to be re-derived from the
source material. It could also happen via suspend/resume on the very
same machine when something happens in between (for example the whole
machine goes through a cold start). It does NOT happen all of a sudden
without any reason. But the crypto algorithm implementation is not aware
that a "live guest migration" or a "suspend/resume cycle" just happened,
so from its point of view the mismatch appears to occur suddenly.
- Do I understand you correctly that a completion is ok? I always had the
impression that waiting on a completion is also a sleeping act and thus
not allowed?
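
To make the first two points a bit more concrete (only -EBUSY means
"retry later", and a re-derive is triggered by a verification pattern
mismatch), here is a small user-space sketch in C. It is a simulation
under stated assumptions, not the driver code: every sim_* name, the
struct layouts and SIM_VP_MISMATCH are hypothetical and exist only for
illustration. The retry loop merely stands in for re-queueing the
request, since the real code path must not sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result code for "verification pattern does not match". */
#define SIM_VP_MISMATCH 1

/* Hypothetical stand-in for a protected key. */
struct sim_protkey {
	uint32_t vp;   /* verification pattern of the wrapping firmware key */
	bool valid;    /* false until a derive succeeded at least once */
};

/* Hypothetical stand-in for the machine the code currently runs on. */
struct sim_machine {
	uint32_t fw_vp;    /* pattern of this machine's firmware key */
	int busy_rounds;   /* derive attempts that still see a running bus scan */
	bool card_present; /* false: no usable crypto card at all */
};

/* Models the return codes described for pkey_key2protkey(). */
static int sim_derive(struct sim_machine *m, struct sim_protkey *pk)
{
	assert(m && pk);
	if (!m->card_present)
		return -ENODEV;          /* final: caller must fail */
	if (m->busy_rounds > 0) {
		m->busy_rounds--;
		return -EBUSY;           /* bus scan running: retry later */
	}
	pk->vp = m->fw_vp;               /* key is now wrapped for this machine */
	pk->valid = true;
	return 0;
}

/* Only -EBUSY asks for a retry; every other error is final. */
static bool sim_should_retry(int rc)
{
	return rc == -EBUSY;
}

/* Models a cpacf operation, which checks the pattern on each invocation. */
static int sim_cpacf_op(const struct sim_machine *m,
			const struct sim_protkey *pk)
{
	if (!pk->valid || pk->vp != m->fw_vp)
		return SIM_VP_MISMATCH;
	return 0;
}

/*
 * One request: run the operation and re-derive only on a mismatch.
 * Each loop iteration stands for one re-queued attempt (for example
 * via delayed work), not for sleeping in place.
 */
static int sim_request(struct sim_machine *m, struct sim_protkey *pk,
		       int max_tries, int *derives)
{
	int i, rc = sim_cpacf_op(m, pk);

	if (rc != SIM_VP_MISMATCH)
		return rc;

	for (i = 0; i < max_tries; i++) {
		rc = sim_derive(m, pk);
		(*derives)++;
		if (!sim_should_retry(rc))
			break;
	}
	return rc ? rc : sim_cpacf_op(m, pk);
}
```

In this model a "migration" is simulated by changing fw_vp: the next
sim_request() sees the mismatch, re-derives, and only gives up on a
final error or when the retry budget is used up while the card is still
busy.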
Thanks for your help and being so patient with us.