Re: smolt privacy policy

Mike McGrath <mmcgrath@xxxxxxxxxx> · Tue, 20 Feb 2007 21:55:31 -0600

Christopher Blizzard wrote:
On Mon, 2007-02-19 at 12:08 -0600, Mike McGrath wrote:

That aside, I think what you have is a good start.  I would start by
listing out what we're collecting, how we connect that to people (or
not) and how we're going to use it.  And start out with why we're doing
it so that people understand our motivation.  Or, another way to put it,
what is the acceptable use policy for the information and how it affects
others.

Yeah, it'd be good to explicitly state what we're going to do with the 
data and then actually follow it as a benchmark for how useful smolt is 
to us.  If we say we're going to do a bunch of things and this time next 
year we haven't, that would be bad :-)
Google's privacy policy is pretty good for its format.  (I won't comment
about the content.)

http://www.google.com/privacypolicy.html

The EFF has some decent resources:

http://www.eff.org/Privacy/

I'll grab some ideas.
But that aside, I think that we need to lay down some ground rules for
what we want to have as outcomes.  Here are my personal views on what we
should try to explain in the policy:

1. That we collect information about the hardware you have in your
machine as well as things that are connected to your machine.

+ packages in the near future.
2. That information is linked with a unique identifier, if the user
chooses to provide one.  This identifier is only there to determine if a
driver breaks or gets better over time.  (It's not just about leverage,
it's also about quality metrics we can add later.)

3. That unique identifier is never connected to an IP address.

That's not quite true...  It is linked in the web logs, and that is by
design (abuse prevention / correction).  Those logs are kept for a
finite amount of time, which is listed in the policy.
4. Information about hardware is only released to the public in
aggregate.  That is, we will never release information about a specific
user, only about trends and groups of users.

5. That anyone who has access to the raw data that makes up the
aggregate will be required to enforce this policy and will not release
specific information to the public.

I've debated this off and on.  Honestly, I think it would be great to
make the databases available to the public.  I can't imagine what harm it
would do, and people could run their own queries against the database as
they want to.  For some reason though, in the back of my mind, this seems
like a bad idea.  I can't give specific reasons why.
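
To make point 4 a bit more concrete, here is a rough sketch (Python,
purely illustrative, not smolt's actual code) of what "released only in
aggregate" could look like.  The record layout ("uuid", "devices"), the
function name and the minimum group size are all assumptions made for
the example; nothing in the draft policy specifies them.

```python
# Hypothetical threshold: device groups seen on fewer hosts than this are
# withheld so a published count cannot single out one machine.
MIN_GROUP_SIZE = 5


def aggregate_devices(reports, min_group_size=MIN_GROUP_SIZE):
    """Return {device: number of distinct hosts} with identifiers removed.

    `reports` is an iterable of dicts with an assumed layout:
    {"uuid": "<unique identifier>", "devices": ["<device name>", ...]}
    """
    hosts_per_device = {}
    for report in reports:
        # set() so a device listed twice in one report is counted once.
        for device in set(report["devices"]):
            hosts_per_device.setdefault(device, set()).add(report["uuid"])

    # Only counts leave this function; the identifiers themselves do not.
    return {
        device: len(hosts)
        for device, hosts in hosts_per_device.items()
        if len(hosts) >= min_group_size
    }
```

The output would say how many hosts reported a given device, but not
which hosts, and a device seen on only a handful of machines would not
show up at all.  Whether a cutoff like that is wanted, and what it
should be, is an open question for the policy.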

   -Mike

_______________________________________________
fedora-advisory-board mailing list
fedora-advisory-board@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/fedora-advisory-board