I should clarify something I said above - I know that some data must be encrypted within the database. My point was that during the design phase you should take the schema, cross out all columns that will be encrypted, and ask what's left. If it's nothing but the primary key and things you'll never sort on (or require uniqueness for), then it's reasonable to ask whether storing the information in a relational database is the best approach. An alternative would be storing each encrypted record as an S3 object or Hadoop HBase record. In both cases the filename/key would be based on something like a hashed email address. In the case of S3 it's important to remember that there is a limit on the number of entries, although I don't remember what it is - some people have hit it when using S3 this way.
It's a little more work, but it's arguably a lot more secure if you make full use of AWS and Hadoop security. On the other hand, keeping everything in a single place is a lot more convenient, and it only requires one security audit, not two.
By 'shadow' table I mean that there's no reason why everyone needs access to everything. A classic example is the user table. Nearly everyone keeps the password (hopefully hashed!) in the user table. There's no reason to do this - it's just as easy to keep the passwords in a separate table with the same primary key. That will only affect the bits of your app that handle authentication - verifying a password or changing it. If anyone manages to use a SQL injection attack to list the contents of the user table they'll learn the user accounts but probably won't get the users' passwords, especially if you give that table some innocuous name instead of 'password'.
But since only a limited amount of code requires access to that table, you can make a design decision to use a separate database user when authenticating. That user will have access to the password table but the regular user won't have any access to it. If you want to be really secure you could put the password table into a different schema, so the table is entirely hidden from someone who has full access to your main schema. Hence "shadow" table. (Plus the fact that user passwords are kept in the /etc/shadow file on unix/linux systems.)
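A minimal sketch of what that could look like in PostgreSQL - all schema, table, and role names here are hypothetical:

    -- Shadow schema; the regular application role gets no access to it.
    CREATE SCHEMA auth_internal;
    REVOKE ALL ON SCHEMA auth_internal FROM PUBLIC;

    -- Innocuous name, same primary key as the user table.
    CREATE TABLE auth_internal.account_extra (
        user_id bigint PRIMARY KEY REFERENCES public.users (user_id),
        pw_hash text NOT NULL    -- salted hash, e.g. bcrypt output
    );

    -- Dedicated role used only by the authentication code path.
    CREATE ROLE auth_service LOGIN PASSWORD '...';
    GRANT USAGE ON SCHEMA auth_internal TO auth_service;
    GRANT SELECT, UPDATE ON auth_internal.account_extra TO auth_service;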
This is a pretty powerful technique. E.g., some people add a column that keeps a salted hash of the data in a row. An intruder won't know to change the hash value, so you can dramatically improve security by checking that hash value when retrieving a value from the database. If it doesn't match you throw an alert and refuse to use the value. However this leaves a trace on the table, so the intruder knows that their change will be detected. That's often a Good Thing - it's a deterrent - but you might prefer to let intruders make changes, apparently undetected, so you can determine that they're in your system. You could use a shadow table to hold the hashes, plus a view, so that your app always sees the hash but an intruder doesn't.
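Here's one way that could look - again a sketch with hypothetical names. Note that PostgreSQL views access their underlying tables with the view owner's privileges, so the app can read through the view without any direct grant on the shadow table:

    -- Shadow table holding the per-row hashes.
    CREATE TABLE audit_internal.row_check (
        user_id  bigint PRIMARY KEY,
        row_hash text NOT NULL    -- salted hash of the row's sensitive columns
    );

    -- The app reads through this view and verifies row_hash itself;
    -- someone browsing public.users never sees that a hash exists.
    CREATE VIEW app_views.users_checked AS
    SELECT u.*, rc.row_hash
    FROM public.users u
    JOIN audit_internal.row_check rc USING (user_id);

    GRANT SELECT ON app_views.users_checked TO app_user;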
You could even go a step further and have a trigger that updates a shadow table with the 'before' and 'after' values on an insert/update/delete. (Or pgaudit if you're on postgresql specifically - the general approach works with any database that supports triggers.) No intruder knows about this, no application developer knows about this - only the DBA and security team. That lets you track down every change - what changed, when, who did it (including IP address if you capture connection info), etc. At most an intruder might see that there's a trigger, but if it works via a stored procedure with the appropriate permissions they won't know what you're doing with it.
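A sketch of such an audit trigger (hypothetical names; the SECURITY DEFINER function is what lets the trigger write into a schema the triggering user can't touch):

    CREATE TABLE audit_internal.users_log (
        changed_at  timestamptz NOT NULL DEFAULT now(),
        db_user     text        NOT NULL DEFAULT current_user,
        client_addr inet                 DEFAULT inet_client_addr(),
        op          text        NOT NULL,
        old_row     jsonb,
        new_row     jsonb
    );

    CREATE FUNCTION audit_internal.log_users_change() RETURNS trigger
    LANGUAGE plpgsql SECURITY DEFINER AS $$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO audit_internal.users_log (op, old_row)
            VALUES (TG_OP, to_jsonb(OLD));
        ELSIF TG_OP = 'INSERT' THEN
            INSERT INTO audit_internal.users_log (op, new_row)
            VALUES (TG_OP, to_jsonb(NEW));
        ELSE
            INSERT INTO audit_internal.users_log (op, old_row, new_row)
            VALUES (TG_OP, to_jsonb(OLD), to_jsonb(NEW));
        END IF;
        RETURN NULL;    -- return value is ignored for AFTER triggers
    END;
    $$;

    CREATE TRIGGER users_audit
    AFTER INSERT OR UPDATE OR DELETE ON public.users
    FOR EACH ROW EXECUTE PROCEDURE audit_internal.log_users_change();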
The one thing to be careful about is that you can't use a trigger to update the hash value mentioned above. That would also be fired by an intruder's changes, so it wouldn't tell you anything. It has to be done programmatically, but (at least with java) it could be implemented by a security team that uses AOP, so the developers know there are a few extra columns but they don't do anything with them... or, if you're paranoid, the security team has its own persistence mechanism so the developers are completely unaware that this is happening, since the hash is written to a different schema & table.
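For example, the application layer could fold the hash into its own DML, so only statements issued through the real code path keep it consistent. A sketch of the in-row variant, assuming the pgcrypto extension, with a hypothetical salt and column list:

    -- Issued by the application as a prepared statement, never by a trigger.
    -- An intruder writing their own UPDATE won't know to recompute row_hash.
    UPDATE public.users
    SET    email    = $1,
           row_hash = encode(digest('app-salt' || $1 || user_id::text, 'sha256'), 'hex')
    WHERE  user_id  = $2;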
Hmm, I should probably write a blog entry for this...
Bear
On Mon, Aug 6, 2018 at 8:13 AM, Evan Bauer <evanbauer@xxxxxxx> wrote:
Bejita,
I suggest you step back and think about the problem from the point of view of the desired security outcome — that of protecting data from improper use by administrators. Some of the elements that (to my mind) ought to be part of achieving that outcome are:
1. Determine and document your organization's data access policies. They could be very simple, but it is important to document and share them.
2. Make use of a privileged access management scheme so that no one has unfettered access to superuser (postgres, root, et al.) passwords, but instead has to check them out from an audited system for a specific task and time period, with appropriate approval processes if needed.
3. Use pgaudit to maintain an independent record of all sensitive access. The doc is at: https://github.com/pgaudit/pgaudit/blob/master/README.md
4. Create a set of administrative roles, each privileged only for the tasks it needs to perform (see the sketch after this list). Under normal circumstances, no one should use the ‘postgres’ account for production access. This also provides a means of enforcing compliance with your policies. Tomas Vondra wrote a good introduction here: https://blog.2ndquadrant.com/auditing-users-and-roles-in-postgresql/
5. Set up automated (I tend to use ELK or Splunk) examination of the audit logs for violations and anomalies. Human review at regular intervals will also make your regulators or security auditors happier (they are never really happy.)
6. Make use of row-level access control and encryption as appropriate to protect your data (also illustrated in the sketch after this list). This blog post by Jonathan Katz is a good introduction: https://info.crunchydata.com/blog/a-postgresql-row-level-security-primer-creating-large-policies
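A minimal sketch of items 4 and 6 (role, table, and column names are hypothetical):

    -- Item 4: a narrowly-privileged administrative role.
    CREATE ROLE monitor_admin LOGIN PASSWORD '...';
    GRANT pg_read_all_settings TO monitor_admin;    -- only what the task needs

    -- Item 6: row-level security so users see only their own rows.
    ALTER TABLE public.orders ENABLE ROW LEVEL SECURITY;
    CREATE POLICY orders_owner ON public.orders
        USING (owner_name = current_user);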
There is a lot of thought and work that goes into executing the steps above, but administering systems and databases that handle sensitive data is a serious responsibility and requires requirements definition, planning, architecture, execution, and then continuous monitoring and improvement. As someone new to the DBA role, you should talk to your architecture colleagues as you have some good and serious work ahead of you.
Cheers,
- Evan
> On Aug 6, 2018, at 09:43, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>
> Bear Giles <bgiles@xxxxxxxxxxxxxx> writes:
>> In postgresql the equivalent user is 'postgres'. Nobody should ever be
>> logged in as that user once you've created the initial user(s). What
>> postgresql calls a 'superuser' is just a user with a few permissions set by
>> default. It's easy to grant the same privileges to any user, or drop them
>> from someone created as a superuser.
>
> Well, more to the point, a superuser is somebody with the rolsuper bit
> set in their pg_authid entry. You can revoke the bootstrap superuser's
> superuserness if you have a mind to -- see ALTER USER. However, as
> everyone has pointed out already, this is a bad idea and you will end
> up undoing it. (Figuring out how to do that without a reinstall is left
> as penance for insisting on a bad idea. It is possible, and I think
> even documented.)
>
> However: a whole lot of what the bootstrap superuser can do is inherent
> in being the owner of all the built-in database objects, and that you
> cannot get rid of. Objects have to be owned by somebody.
>
> regards, tom lane
>