On 04/17/2012 01:56 PM, Ian Weller wrote:
> On Tue, Apr 17, 2012 at 01:36:50PM -0400, Tom Callaway wrote:
>> On 04/17/2012 01:15 PM, Ian Weller wrote:
>>> As part of the statistics++ project [1] it is Infrastructure's plan to
>>> make data about visits to Fedora Project web servers public, in order to
>>> automate the information made available on the Statistics wiki page.
>>>
>>> The httpd logs currently contain personally-identifiable information:
>>> the IP address the request originated from and the user agent header.
>>>
>>> We think that at an absolute minimum we need to hash the IP address
>>> (with a seed, obviously) and leave the user agent header as is. But we
>>> wanted to make sure we got legal's opinion on this.
>>
>> Can you show me a hypothetical "before" and "after" example?
>
> Assuming we treat the logs as described above:
>
> Before example (private, on Fedora log servers):
>
> 66.391.22.111 - - [15/Apr/2012:04:02:56 +0000] "GET /static/css/fedora960-lang.css HTTP/1.0" 200 233 "http://start.fedoraproject.org/index.html.en" "Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0"
>
> After example (available to public via statistics++ project):
>
> c9326fa15a1d8a773386ddcdc16132f8 - - [15/Apr/2012:04:02:56 +0000] "GET /static/css/fedora960-lang.css HTTP/1.0" 200 233 "http://start.fedoraproject.org/index.html.en" "Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0"
>
> Requests from the same IP address would have the same hash so that we
> could run scripts that count unique visitors.

As long as we're not somehow embedding any geo-ip information in those
hashes, I think we're okay.

~tom

==
Fedora Project
_______________________________________________
legal mailing list
legal@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/legal
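
For illustration only, the seeded hashing discussed in this thread could be sketched roughly as below. The thread does not say which hash function or seed handling Infrastructure intends to use, so everything here is an assumption: HMAC-SHA256 truncated to 32 hex characters (to match the length of the "after" example), a hypothetical `SECRET_SEED` value, and the hypothetical helper names `pseudonymize_ip`, `anonymize_line`, and `count_unique_visitors`. It also assumes the client address is the first space-separated field of each log line, as in the "before" example.

```python
import hashlib
import hmac

# Hypothetical seed. The thread does not say how the seed is chosen or
# stored; it must stay private, or the small IPv4 space can be brute-forced.
SECRET_SEED = b"replace-with-a-private-random-value"


def pseudonymize_ip(ip, seed=SECRET_SEED):
    """Return a keyed hash of the address, truncated to 32 hex characters."""
    # HMAC-SHA256 is an assumed choice, not something the thread specifies.
    digest = hmac.new(seed, ip.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:32]


def anonymize_line(line, seed=SECRET_SEED):
    """Replace the leading client-address field; leave the rest untouched."""
    ip, sep, rest = line.partition(" ")
    return pseudonymize_ip(ip, seed) + sep + rest


def count_unique_visitors(anonymized_lines):
    """Count distinct first fields (hashed addresses) across log lines."""
    return len({line.split(" ", 1)[0] for line in anonymized_lines})


if __name__ == "__main__":
    # 192.0.2.0/24 is a documentation-only range, used here instead of a
    # real client address.
    sample = (
        '192.0.2.10 - - [15/Apr/2012:04:02:56 +0000] '
        '"GET /static/css/fedora960-lang.css HTTP/1.0" 200 233 '
        '"http://start.fedoraproject.org/index.html.en" '
        '"Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0"'
    )
    print(anonymize_line(sample))
```

Because the output depends only on the address and the seed, the same address always maps to the same value, which is what makes the unique-visitor count possible, and no separate geo-ip lookup result is written into the log. Whether that is sufficient for the concern raised above is a question for legal, not something this sketch settles.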