On 06-03-2017 12:45, Henrik Danielsson via arch-general wrote:
> 2017-03-06 12:53 GMT+01:00 Mauro Santos via arch-general <arch-general@xxxxxxxxxxxxx>:
>
>> On 06-03-2017 11:20, Henrik Danielsson via arch-general wrote:
>>> 2017-03-06 11:18 GMT+01:00 Ralf Mardorf <silver.bullet@xxxxxxxx>:
>>>>
>>>> Privacy is a principle. You seem not to understand the difference
>>>> between giving somebody data with the formal permission to use this data
>>>> and data that simply is available for everybody, but not explicitly
>>>> handed over to somebody. Paranoia isn't involved in my concern.
>>>>
>>> My standpoint is that privacy does not apply to this kind of public
>>> information, simply because it's not private and by no means sensitive
>>> (people freely chose the username and other visible info they posted, no?).
>>> Thus, no, I see no difference and really no point in even considering
>>> trying to keep such information private.
>>>
>>> What anyone does with the freely available information posted in the AUR is
>>> up to them ("mining" it or handing it over to someone else included), we
>>> could not do anything about it anyway, nor would I even care if I was in
>>> that list or not, since there seems to be no ToS between the one submitting
>>> that information and the one publishing it. Since it was freely submitted
>>> without any terms, I can simply not find any restrictions on its usage.
>>>
>>> Yes, we should have a ToS to at least keep the principle of privacy alive.
>>> But let's face it, real privacy online has been dead for long, if it ever
>>> existed.
>>>
>>> If there was a ToS, the situation would perhaps have been different, at
>>> least legally. I'm no legal expert of course, but to me it makes perfect
>>> sense that if you posted something on the internet, in a very public space,
>>> you can have no expectations of keeping any of that information private in
>>> any way, nor any information easily associated with.
>>>
>>> No, I don't see that as a problem, at least not if you never explicitly
>>> agreed that information would not be shared. What I really want to keep
>>> private I don't post anywhere.
>>>
>>
>> I think the point here is not so much privacy, as I believe everyone
>> recognizes that the information that was asked for (the full list of
>> usernames) is public and can be scraped.
>>
>> The point here is handing over the full list of usernames on request. Do
>> note that in their research proposal[1] they specifically mention
>> scraping information from github. That information is public, github
>> does have an API to query that information, but they still have to
>> scrape it, I suppose that implies github does not hand it over wholesale
>> on request, why should we? This might be due to their ToS or they know
>> something we don't.
>>
> It would be rather interesting to see what they could come up with from
> that correlation.

Probably nothing meaningful. As I've said before you have no way of
knowing if user foo on github is the same as user foo on the AUR.

> I think, perhaps a bit cynically, the reason github may not hand over that
> data directly is likely that they don't want to do some of the work of the
> researchers for them. As you said, the data is there, the format matters
> less if they're going to massage it into something else later anyway, so
> why bother with the effort of compiling it on their [github] own time?
>
> We could simply deny the AUR username request it for the same reason, or no
> reason at all. Since some people seem uncomfortable about what could be
> derived from a potential correlation of publicly available data, that's
> most likely the safest way to go.
>

--
Mauro Santos