Roman,

Your response actually raises (at least for me) some additional
questions (saving for later issues on which you have promised
details later)...

--On Wednesday, May 22, 2019 20:49 +0000 Roman Danyliw
<rdd@xxxxxxxx> wrote:

>...
> A few answers below.
>
>> -----Original Message-----
>> From: Stephen Farrell [mailto:stephen.farrell@xxxxxxxxx]
>> Sent: Tuesday, May 21, 2019 11:43 AM
>> To: Roman Danyliw <rdd@xxxxxxxx>; ietf@xxxxxxxx
>> Subject: Re: Call for Community Input: Web Analytics on
>> www.ietf.org
>...
>> - Do the IESG plan to evaluate the utility of this
>> with the possibility to ditch it if it doesn't
>> in fact tell us something useful? If so, when?
>> How will you decide if it's worth keeping?
>
> In the "Implementation" section the proposal notes that
> "[f]ollowing finalization and implementation of the proposal,
> ... the web analytics and reports will be reviewed by the
> IETF Tools Team after one-year to confirm they are delivering
> anticipated results." The IETF Tools Team will bring a
> recommendation to the IESG. Whether these analytics are worth
> keeping will be determined by whether they informed site
> improvement (as outlined in the "Introduction" section).

I'm still not clear as to why this effort is needed at all. I am
sympathetic to Keith Moore's observations which I read as being
about collecting measurements and doing statistics that are easy
rather than digging down far enough to determine which are needed
for specific purposes, and then figuring out how, if possible, to
gather those statistics and keeping them as focused as possible.
Gathering data because we can, because tools are readily available
to capture certain data, and then passing it off to the Tools Team
to figure out what it is good for (or whether it provided
"anticipated results" without that being specified in advance) does
not strike me as a good way to proceed.
It also violates the most basic of privacy protection principles,
which is that data that are not collected are data one doesn't have
to worry about securing, retention times, etc.

It seems to me that the first step in this process should be a
clear statement from the Tools Team or other decision makers about
how they expect to use the data and what data are needed for that
purpose. That expectation is supported by a statement in the
Proposal that says

   "website analytics must be implemented to ...
    ● limit data being collected to that needed to serve
      specific identified purposes".

The first two items in the list of data to be reported are:

    ● overall number of visitors;
    ● views per webpage;

Why do we care? If, e.g.,
https://www.ietf.org/about/groups/iesg/members/
does not attract as much traffic as, e.g.,
https://www.ietf.org/how/meetings/upcoming/
does, does that mean we are going to take it down? Redesign it with
more animation in the hope of drawing additional traffic? That is
a silly example, but I trust the problem is clear.

The next paragraph starts "After considering several options for
implementing analytics,...", which sounds a lot like we have
skipped over "why" and "what" to get to "how".

However, assuming for purposes of discussion that this is really
needed for some useful purpose...

>> - Will this new information be shared with anyone
>> else (e.g. ISOC as allowed for in [2]).
>
> The proposal outlines that the "IETF Secretariat,
> communications staff, and the IESG"
>...
> I'll have to follow-up on the additional users (ISOC) implied
> by [2].

I note that the Tools Team, who are explicitly called out as
getting the data, are, except for individual coincidences, not part
of the IETF Secretariat, the communications staff (I think I know
who/what that means, but am not sure), or the IESG, so the list of
parties with whom information is shared is a superset of that list
even before ISOC staff is considered.
It is also interesting that neither the IETF LLC Exec Director nor
the IETF LLC Board is on the list of people to be given access to
the data, something that would probably make it hard to evaluate
the results and utility of this work. Given that this sort of
thing isn't free even if we (volunteers or the Secretariat)
maintain the software on our own equipment, I'd hope that sort of
evaluation would be part of any ongoing effort.

>> - Does this constitute tracking behaviour? The
>> current privacy policy [2] says we don't do that.
>
> My read is no.

Reporting

    ● traffic sources;
    ● aggregated visitor profiles (including OS, browser, and
      primary languages); and
    ● visitors' paths through the site (including time spent on
      webpages, as well as entry and exit pages)

certainly sounds like tracking behavior to me. I'm interested in
why you don't read it that way because we may have different
definitions.

> [3] says that "tracking is the collection of data regarding a
> particular user's activity across multiple distinct contexts
> and the retention, use, or sharing of data derived from that
> activity outside the context in which it occurred. A context
> is a set of resources that are controlled by the same party or
> jointly controlled by a set of parties."

By this definition, one can do just about anything one likes to
capture information about user behavior as long as that user
doesn't leave "a context" and one defines "context" in an
appropriately broad way. In particular...

> *.ietf.org servers are single context controlled by the same
> party (IETF). The proposed implementation plan is a
> self-hosted solution which does indeed collect activity data
> but NOT across "multiple, distinct contexts".

Really? First, whether *.ietf.org servers are "controlled by the
same party" is questionable. One could suggest that they are
"controlled by" some combination of the IETF LLC, AMS, the Tools
Team, and maybe some "cloud" or "CDN" suppliers.
Perhaps at least some of those relationships are tightly enough
specified contractually to make the control by the IETF LLC (not
the IESG) clear, but "the IETF" (as a community of participants)
doesn't know enough about the details of those contracts to be
confident about that. And, unless the Tools Team started being
subject to Close Technical Supervision while I wasn't looking,
pages in ietf.org subdomains they control and manage cannot be said
to be "controlled by" the IETF.

IMO, lots of loose ends here. From my point of view, the most
important is why we actually need to do this, what we hope to
accomplish, and, from those things, what data we actually need.
The "information to be collected" part of the proposal is not
especially helpful in this regard. Some of the comments do help in
imagining what is intended and why, but I don't think we should
need to imagine.

best,
   john