Re: [Fedora-legal-list] Making Infrastructure httpd logs public

Ricky Zhou <ricky@xxxxxxxxxxxxxxxxx> · Wed, 18 Apr 2012 12:27:32 -0400

On 2012-04-18 09:56:44 AM, Kevin Fenzi wrote:
> http://stackoverflow.com/questions/4552566/logging-ip-address-for-uniqueness-without-storing-the-ip-address-itself-for-priv
> 
> has some ideas, but no great clear answer. 
> 
> http://bug.st/mod_anonstats seems to use md5. 
> 
> I'm assuming the consumer of these logs will process them after they
> are hashed? In which case we do need to make sure the same ip hashes to
> the same hash ? Or could we process them first, then hash the ip before
> making the data public?
I think something like an HMAC is the correct way to hide IPs.
Unfortunately, there is still information other than IP address that can
potentially leak some privacy information, such as:
 * rare/unique user agent strings
 * URLs that can be linked to the person who's visiting them (a lot
   of mailman links contain emails, for example)
 * potentially still-valid CSRF tokens
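
To make the HMAC idea concrete, here is a rough sketch in Python. It only
covers the IP field and does nothing about the other leaks listed above.
The reason for a keyed hash rather than plain md5 is that the IPv4 space
is small enough to hash exhaustively, so an unkeyed hash can be reversed
by brute force. The names SECRET_KEY, pseudonymize_ip and scrub_line are
made up for illustration, and I'm assuming log lines that start with the
client IP followed by a space.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key: generated once, kept private, and never
# published alongside the logs.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize_ip(ip: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash of the IP; same IP and same key give the same token."""
    return hmac.new(key, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_line(line: str, key: bytes = SECRET_KEY) -> str:
    """Replace the leading client IP of a log line with its keyed hash."""
    ip, sep, rest = line.partition(" ")
    return pseudonymize_ip(ip, key) + sep + rest
```

With the same key, the same IP always maps to the same token, so unique
visitor counts still work on the scrubbed logs. Rotating the key (say,
per month) would prevent linking visitors across periods, at the cost of
losing longer-term uniqueness.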

I think a lot more thought and user notification should happen before we
can consider making logs public.  Alternatively, what do you think about
a system where somebody who wants to run statistics either gets access
to the logs, or gives us a script that we'll verify and then run in a
cronjob?  I don't think we'll get so many requests that doing things
manually like this becomes a burden.

Maybe we can also take a look at how organizations like Wikipedia handle
these sorts of things.

Thanks,
Ricky
_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/infrastructure