On 2012-04-18 09:56:44 AM, Kevin Fenzi wrote:
> http://stackoverflow.com/questions/4552566/logging-ip-address-for-uniqueness-without-storing-the-ip-address-itself-for-priv
>
> has some ideas, but no great clear answer.
>
> http://bug.st/mod_anonstats seems to use md5.
>
> I'm assuming the consumer of these logs will process them after they
> are hashed? In which case we do need to make sure the same ip hashes to
> the same hash ? Or could we process them first, then hash the ip before
> making the data public?

I think something like an HMAC is the correct way to hide IPs.
Unfortunately, there is still information other than the IP address
that can potentially leak private information, such as:

 * rare/unique user agent strings
 * URLs that can be linked to the person who's visiting them (a lot of
   mailman links contain emails, for example)
 * potentially still-valid CSRF tokens

I think a lot more thought and user notification should happen before
we can consider making logs public.

Alternatively, what do you think about a system where somebody who
wants to run statistics either gets access to the logs, or gives us a
script that we'll verify and then run in a cronjob? I don't think we'll
get enough requests for doing things manually like this to become a
burden.

Maybe we can also take a look at how organizations like Wikipedia
handle these sorts of things.

Thanks,
Ricky
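
For reference, a minimal sketch of the HMAC idea in Python. The function name, the choice of SHA-256, and the way the key is generated are assumptions made for illustration; nothing in this thread specifies them. The point it shows is that, under one secret key, the same IP always maps to the same token, so unique-visitor counts still work on the hashed logs.

```python
import hashlib
import hmac
import secrets


def pseudonymize_ip(ip: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest of an IP address as a hex string.

    Hypothetical helper: the name and the hash choice are assumptions.
    """
    return hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()


# Assumed key handling: a random secret that stays private and is never
# published alongside the logs.
key = secrets.token_bytes(32)

# Example addresses from the documentation range (RFC 5737).
a = pseudonymize_ip("192.0.2.10", key)
b = pseudonymize_ip("192.0.2.10", key)
c = pseudonymize_ip("192.0.2.11", key)

print(a == b)  # same IP, same key: tokens match
print(a == c)  # different IPs: tokens differ
```

Unlike a plain md5 of the address, the tokens cannot be reversed by hashing the whole IPv4 space unless the key leaks. This does nothing for the other leaks listed above (user agents, URLs, CSRF tokens).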
_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/infrastructure