Re: PKCS#11 URI slot attributes & last call

Nico Williams <nico@xxxxxxxxxxxxxxxx> · Wed, 31 Dec 2014 01:03:33 -0600

On Wed, Dec 31, 2014 at 07:29:47AM +0100, Patrik Fältström wrote:
> > On 30 dec 2014, at 20:53, Nico Williams <nico@xxxxxxxxxxxxxxxx> wrote:
> > Better say
> > nothing, because I think the thing to do is obvious enough, but if we
> > must say anything, it's that the various strings (e.g., token manuf)
> > are to be compared normalization-insensitively.
> 
> Sorry, but I have not heard the term "normalization-insensitively" before.
> 
> Can you explain what you mean?

Notionally, if you're comparing two unnormalized strings, you could
normalize both then compare the two normalized strings.
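
A minimal sketch of that notional approach, in Python, using the
standard unicodedata module.  The function name and the choice of NFC
as the default form are mine, for illustration only; any one form works
as long as both sides use the same one.

```python
import unicodedata

def equal_normalized(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

composed = "caf\u00e9"      # 'e with acute' as one precomposed codepoint
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

print(composed == decomposed)                  # plain comparison
print(equal_normalized(composed, decomposed))  # normalization-insensitive
```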

Of course, that can be inefficient (e.g., if it means allocating memory,
or if they will prove not equal in the first few codepoints) or
infeasible (e.g., if one of the strings is actually a hashed key to a
hash table).

What you can do for the first case is compare code-unit by code-unit, with
a fast path for the cases that need no normalization, and normalizing
one character (but possibly multiple codepoints, of course) at a time.
This limits the total memory consumption, and anyways, for the common
case you can often expect an inequality result long before you're done
traversing the shorter string.  This is (can be, if you do it right), of
course, equivalent to normalizing both strings then comparing -- but it
should usually be much faster.
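
Here is one way that could look, as a rough Python sketch rather than a
tuned implementation.  All the names are hypothetical.  It walks
codepoints (Python has no code-unit view of str), decomposes one
segment at a time to NFD, passes lone ASCII characters through
untouched as the fast path, and stops at the first mismatch.  A segment
starts at any codepoint whose decomposition begins with a starter, so
combining marks are never reordered across a segment boundary.

```python
import unicodedata
from itertools import zip_longest

def _starts_segment(ch: str) -> bool:
    # ASCII never decomposes and is always a starter.
    if ch < "\x80":
        return True
    # Otherwise look at the first codepoint of the character's own NFD form.
    return unicodedata.combining(unicodedata.normalize("NFD", ch)[0]) == 0

def nfd_stream(s: str):
    """Yield the codepoints of NFD(s), normalizing one segment at a time."""
    i, n = 0, len(s)
    while i < n:
        j = i + 1
        while j < n and not _starts_segment(s[j]):
            j += 1
        if j - i == 1 and s[i] < "\x80":
            yield s[i]  # fast path: lone ASCII character, nothing to do
        else:
            yield from unicodedata.normalize("NFD", s[i:j])
        i = j

def equal_incremental(a: str, b: str) -> bool:
    # all() short-circuits, so unequal strings are rejected early;
    # zip_longest pads with None, so a length mismatch compares unequal.
    return all(x == y for x, y in zip_longest(nfd_stream(a), nfd_stream(b)))
```

Only one segment is ever held in normalized form at a time, which is
where the bounded memory use comes from.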

For the second case the thing to do is to normalize the key at hash
time, naturally.
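
For illustration, a small Python wrapper that does this.  The class
name is made up, and NFC is an arbitrary choice of form; the only point
is that every key goes through the same normalization before it is
hashed, on insert and on lookup alike.

```python
import unicodedata

class NormalizedKeyTable:
    """Hash table whose string keys are normalized before hashing."""

    def __init__(self, form: str = "NFC"):
        self._form = form
        self._items = {}

    def _key(self, name: str) -> str:
        return unicodedata.normalize(self._form, name)

    def __setitem__(self, name: str, value) -> None:
        self._items[self._key(name)] = value

    def __getitem__(self, name: str):
        return self._items[self._key(name)]

    def __contains__(self, name: str) -> bool:
        return self._key(name) in self._items

    def __len__(self) -> int:
        return len(self._items)

table = NormalizedKeyTable()
table["caf\u00e9"] = "stored under the precomposed spelling"
print("cafe\u0301" in table)  # looked up under the decomposed spelling
```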

[ZFS, incidentally, supports this for filesystem object names, and has
for years now.]

Now, PKCS#11 nowadays supports UTF-8 for things like "token label", but
it doesn't say anything about form -- why should it (see below)?

But where PKCS#11 URIs are intended to _match_ PKCS#11 resources by
name... apps will need to care about normalization.  In practice, as in a
great many applications, doing nothing about normalization will probably
work fine (until the day that it doesn't).  But saying anything about
this could be tricky: what if there are two tokens with equivalent
labels, just in different forms?  Fortunately PKCS#11 URIs can match on
more attributes than labels, so there's that.
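
To make the ambiguity concrete, a toy Python sketch.  The token list is
invented data, the attribute names "token" and "serial" are used purely
as illustrative dictionary keys, and match_tokens is a hypothetical
helper, not anything from a PKCS#11 library.  It assumes the URI
attribute values have already been percent-decoded.

```python
import unicodedata

def _norm(s: str) -> str:
    return unicodedata.normalize("NFC", s)

def match_tokens(wanted: dict, tokens: list) -> list:
    """Return the tokens whose attributes all match, ignoring normalization form."""
    return [
        t for t in tokens
        if all(k in t and _norm(t[k]) == _norm(v) for k, v in wanted.items())
    ]

# Two hypothetical tokens whose labels are equivalent but in different forms.
tokens = [
    {"token": "Cl\u00e9", "serial": "0001"},   # precomposed label
    {"token": "Cle\u0301", "serial": "0002"},  # decomposed label
]

by_label = match_tokens({"token": "Cl\u00e9"}, tokens)
by_label_and_serial = match_tokens({"token": "Cl\u00e9", "serial": "0002"}, tokens)
print(len(by_label), len(by_label_and_serial))
```

Matching on the label alone hits both tokens; adding a second
attribute narrows it to one.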

PKCS#11 should say "don't do that" or "don't do that, normalize to NFC"
(or NFD, whatever), but doesn't (or I didn't find where it does, if it
does), so the most that this document could say is "compare
normalization-insensitively where possible".

Nico
--