Bill Moran wrote:
John R Pierce <pierce@xxxxxxxxxxxx> wrote:
Eric Soroos wrote:
an index on the encrypted SSN field would do this just fine. if an
authorized person needs to find the record with a specific SSN, they
encrypt that SSN and then look up the ciphertext in the database...
done.
This will only work for ciphers in electronic codebook (ECB) mode, and
not for chained block modes such as CBC, since the initialization vector
will randomize the output of the encryption so that E(foo) != E(foo),
precisely to prevent this sort of attack.
can those sorts of chained block ciphers decode blocks in a different
order than they were originally encoded? for this sort of
application, wouldn't each field or record pretty much have to be
encrypted discretely so that they can be decrypted in any order, or any
single record be decrypted on its own?
Eric is right about CBC ciphers. The problem is that any function that
will produce the same output for the same input (such as MD5 or SHA) leaves
us open to brute force attacks if the number of choices is small, or
pattern discovery attacks in other cases. And anything that protects
us against such attacks (such as AES-CBC) will generate data that I
can't pre-encrypt and search against.
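To make the brute force point concrete, here is a minimal Python sketch.
It assumes an unsalted SHA-256 digest and an attacker who already knows
the first five digits, so only 10,000 candidates remain. The SSN value,
the prefix, and the function names are made up for illustration.

```python
import hashlib
from typing import Optional


def digest(ssn: str) -> str:
    """Deterministic, unsalted digest: same input always gives same output."""
    return hashlib.sha256(ssn.encode("ascii")).hexdigest()


def brute_force(target: str, prefix: str) -> Optional[str]:
    """Try every 4-digit ending after a known 5-digit prefix."""
    for serial in range(10_000):
        candidate = f"{prefix}{serial:04d}"
        if digest(candidate) == target:
            return candidate
    return None


# Example value only; this is what an attacker would find in the table.
stored = digest("078051120")
recovered = brute_force(stored, "07805")
print(recovered)
```

Without a known prefix the loop simply runs over all 9-digit values
instead, which is slower but still a fixed, enumerable space.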
I haven't tried it, but I don't believe CBC ciphers can decrypt data out
of order.
In the implementation I've built, the IV is stored with the ciphertext,
much the same way that crypt() stores the salt with the password hash.
As a result, if you have the key, you then have all the data required
to decrypt the field, but you can't easily brute force it or do any
pattern analysis.
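As a rough illustration of the crypt() comparison (not the
implementation described above, and using a salted hash rather than a
cipher, since the Python standard library has no block cipher), the
sketch below stores a random salt in front of the digest. The names
protect and matches are hypothetical.

```python
import hashlib
import hmac
import os

SALT_LEN = 16  # assumption: 16 random bytes, stored in front of the digest


def protect(value: bytes) -> bytes:
    """Return salt || SHA-256(salt || value); the salt is fresh each call."""
    salt = os.urandom(SALT_LEN)
    return salt + hashlib.sha256(salt + value).digest()


def matches(stored: bytes, value: bytes) -> bool:
    """Re-derive the digest using the salt kept with the stored value."""
    salt, expected = stored[:SALT_LEN], stored[SALT_LEN:]
    actual = hashlib.sha256(salt + value).digest()
    return hmac.compare_digest(actual, expected)


a = protect(b"078051120")
b = protect(b"078051120")
print(a != b)                      # same input, different stored bytes
print(matches(a, b"078051120"))    # still checkable row by row
```

Because the two stored values differ, there is nothing to pre-compute
and look up in an index; a search has to read each row's salt and test
it individually.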
Searching encrypted data is difficult in a situation like this. There is
research (e.g. [1,2]) into encrypting relatively large _text_ fields so
that the ciphertext is amenable to search, but in general all the
schemes sacrifice functionality for security - partial matches, for
example, are relatively difficult to achieve and regular expressions are
virtually impossible without manual expansion. Plus you'd have to roll
your own, which would be prone to error.
My current university project is in this area, developing a system for
PostgreSQL that allows secure search on encrypted _text_ fields (e.g.
full documents) using [2], but for a field as small as an SSN, there
isn't really anything obvious.
Having said that, using a block cipher in ECB mode on the SSN should be
enough to be able to perform fast exact-matches (based on my limited
knowledge of SSNs). Assuming the 'user' table is normalised so all the
SSNs in it will be unique, the possibility of frequency analysis on the
ciphertext is slim, especially since a 9 digit SSN encoded as ASCII will
easily fit into a single block of most recent ciphers (AES has a 16-byte
block for example). Each SSN's ciphertext will be as unique as the
original. The same holds for a one-way hash function. An SSN has only
1 billion (10^9) possible values, so a brute force attack would be
possible, but I don't see a way of avoiding this.
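A minimal Python sketch of the exact-match idea follows. It swaps in a
keyed hash (HMAC-SHA256) for the ECB block cipher, because the standard
library has no AES; the property that matters here is the same, namely
that equal SSNs give equal tokens. The key, the sample rows, and the
dict standing in for an indexed column are all assumptions.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would live outside the database.
INDEX_KEY = b"hypothetical-secret-key"


def search_token(ssn: str) -> str:
    """Deterministic keyed token: the same SSN always maps to the same token."""
    return hmac.new(INDEX_KEY, ssn.encode("ascii"), hashlib.sha256).hexdigest()


# Stand-in for a table with an index on the token column (made-up rows).
rows = [(1, "078051120"), (2, "219099999")]
index = {search_token(ssn): user_id for user_id, ssn in rows}


def find_user(ssn: str):
    """Exact-match lookup: compute the token, then do one index probe."""
    return index.get(search_token(ssn))


print(find_user("078051120"))
print(find_user("000000000"))
```

As with ECB, anyone holding the key can still enumerate all 10^9 values
and rebuild the mapping, so the key has to be guarded as carefully as
the data.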
You could look into stream ciphers for left-most matches, but these are
almost always susceptible to statistical attacks when used incorrectly.
Generally speaking, searching requires a pattern, which leads to
possible attacks. I think you'll have to either put up with the
inefficiency or sacrifice some amount of security.
Cheers,
Will.
[1] http://www.cs.berkeley.edu/~dawnsong/papers/se.pdf
[2] http://gnunet.org/papers/secureindex.pdf
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)