2017-03-06 14:36 GMT+01:00 Mauro Santos via arch-general <arch-general@xxxxxxxxxxxxx>:

> On 06-03-2017 12:45, Henrik Danielsson via arch-general wrote:
>> 2017-03-06 12:53 GMT+01:00 Mauro Santos via arch-general <arch-general@xxxxxxxxxxxxx>:
>>
>>> On 06-03-2017 11:20, Henrik Danielsson via arch-general wrote:
>>>> 2017-03-06 11:18 GMT+01:00 Ralf Mardorf <silver.bullet@xxxxxxxx>:
>>>>>
>>>>> Privacy is a principle. You seem not to understand the difference
>>>>> between giving somebody data with the formal permission to use this data
>>>>> and data that simply is available for everybody, but not explicitly
>>>>> handed over to somebody. Paranoia isn't involved in my concern.
>>>>>
>>>> My standpoint is that privacy does not apply to this kind of public
>>>> information, simply because it's not private and by no means sensitive
>>>> (people freely chose the username and other visible info they posted, no?).
>>>> Thus, no, I see no difference and really no point in even considering
>>>> trying to keep such information private.
>>>>
>>>> What anyone does with the freely available information posted in the AUR is
>>>> up to them ("mining" it or handing it over to someone else included), we
>>>> could not do anything about it anyway, nor would I even care if I was in
>>>> that list or not, since there seems to be no ToS between the one submitting
>>>> that information and the one publishing it. Since it was freely submitted
>>>> without any terms, I can simply not find any restrictions on its usage.
>>>>
>>>> Yes, we should have a ToS to at least keep the principle of privacy alive.
>>>> But let's face it, real privacy online has been dead for long, if it ever
>>>> existed.
>>>>
>>>> If there was a ToS, the situation would perhaps have been different, at
>>>> least legally. I'm no legal expert of course, but to me it makes perfect
>>>> sense that if you posted something on the internet, in a very public space,
>>>> you can have no expectations of keeping any of that information private in
>>>> any way, nor any information easily associated with it.
>>>> No, I don't see that as a problem, at least not if you never explicitly
>>>> agreed that information would not be shared. What I really want to keep
>>>> private I don't post anywhere.
>>>>
>>>
>>> I think the point here is not so much privacy, as I believe everyone
>>> recognizes that the information that was asked for (the full list of
>>> usernames) is public and can be scraped.
>>>
>>> The point here is handing over the full list of usernames on request. Do
>>> note that in their research proposal[1] they specifically mention
>>> scraping information from github. That information is public, github
>>> does have an API to query that information, but they still have to
>>> scrape it, I suppose that implies github does not hand it over wholesale
>>> on request, why should we? This might be due to their ToS or they know
>>> something we don't.
>>>
>> It would be rather interesting to see what they could come up with from
>> that correlation.
>
> Probably nothing meaningful. As I've said before you have no way of
> knowing if user foo on github is the same as user foo on the AUR.
>

True, but you could make a decent guess based on how many coincidences there
are surrounding those names. Relations between names could be interesting
even if the people behind them are not the same.

>> I think, perhaps a bit cynically, the reason github may not hand over that
>> data directly is likely that they don't want to do some of the work of the
>> researchers for them. As you said, the data is there, the format matters
>> less if they're going to massage it into something else later anyway, so
>> why bother with the effort of compiling it on their [github] own time?
>>
>> We could simply deny the AUR username request for the same reason, or no
>> reason at all. Since some people seem uncomfortable about what could be
>> derived from a potential correlation of publicly available data, that's
>> most likely the safest way to go.
>>
>
>
> --
> Mauro Santos