RE: Call for Community Input: Web Analytics on www.ietf.org

Roman,

Your response actually raises (at least for me) some additional
questions (setting aside, for now, the issues on which you have
promised details later)...

--On Wednesday, May 22, 2019 20:49 +0000 Roman Danyliw
<rdd@xxxxxxxx> wrote:

>...
> A few answers below.
> 
>> -----Original Message-----
>> From: Stephen Farrell [mailto:stephen.farrell@xxxxxxxxx]
>> Sent: Tuesday, May 21, 2019 11:43 AM
>> To: Roman Danyliw <rdd@xxxxxxxx>; ietf@xxxxxxxx
>> Subject: Re: Call for Community Input: Web Analytics on
>> www.ietf.org
>...
 
>> - Do the IESG plan to evaluate the utility of this
>>   with the possibility to ditch it if it doesn't
>>   in fact tell us something useful? If so, when?
>>   How will you decide if it's worth keeping?
> 
> In the "Implementation" section the proposal notes that
> "[f]ollowing finalization and implementation of the proposal,
> ...  the web analytics and reports will be reviewed by the
> IETF Tools Team after one-year to confirm they are delivering
> anticipated results."  The IETF Tools Team will bring a
> recommendation to the IESG.  Whether these analytics are worth
> keeping will be determined by whether they informed site
> improvement (as outlined in the "Introduction" section).

I'm still not clear as to why this effort is needed at all.  I
am sympathetic to Keith Moore's observations which I read as
being about collecting measurements and doing statistics that
are easy rather than digging down far enough to determine which
are needed for specific purposes, and then figuring out how, if
possible, to gather those statistics and keeping them as focused
as possible.  Gathering data because we can, because tools are
readily available to capture certain data, and then passing it
off to the Tools Team to figure out what it is good for (or
whether it provided "anticipated results" without that being
specified in advance) does not strike me as a good way to
proceed.  It also violates the most basic of privacy protection
principles, which is that data that are not collected are data
one doesn't have to worry about securing, retention times, etc.

It seems to me that the first step in this process should be a
clear statement from the Tools Team or other decision makers
about how they expect to use the data and what data are needed
for that purpose.  That expectation is supported by a statement
in the Proposal that says "website analytics must be implemented
to ... ● limit data being collected to that needed to serve
specific identified purposes".  The first two items in the list
of data to be reported are:
  ● overall number of visitors;
  ● views per webpage;

Why do we care?  If, e.g.,
https://www.ietf.org/about/groups/iesg/members/ does not attract
as much traffic as, e.g.,
https://www.ietf.org/how/meetings/upcoming/ does that mean we
are going to take it down?  Redesign it with more animation in
the hope of drawing additional traffic?  That is a silly
example, but I trust the problem is clear.  

The next paragraph starts "After considering several options for
implementing analytics,...", which sounds a lot like we have
skipped over "why" and "what" to get to "how".

However, assuming for purposes of discussion that this is really
needed for some useful purpose...

>> - Will this new information be shared with anyone
>>   else (e.g. ISOC as allowed for in [2]).
> 
> The proposal outlines that the "IETF Secretariat,
> communications staff, and the IESG" 
>...
> I'll have to follow-up on the additional users (ISOC) implied
> by [2].

I note that the Tools Team, who are explicitly called out as
getting the data, are, except for individual coincidences, not
part of the IETF Secretariat, the communications staff (I think
I know who/what that means, but am not sure), or the IESG, so
the list of parties with whom information is shared is a
superset of that list even before ISOC staff is considered.  It
is also interesting that neither the IETF LLC Exec Director nor
the IETF LLC Board are on the list of people to be given access
to the data, something that would probably make it hard to
evaluate the results and utility of this work.  Given that this
sort of thing isn't free even if we (volunteers or the
Secretariat) maintain the software on our own equipment, I'd
hope that sort of evaluation would be part of any ongoing effort.

>> - Does this constitute tracking behaviour? The
>>   current privacy policy [2] says we don't do that.
> 
> My read is no.

Reporting
  ● traffic sources; and
  ● aggregated visitor profiles (including OS, browser, and
    primary languages)
  ● visitors' paths through the site (including time spent on
    webpages, as well as entry and exit pages)

certainly sounds like tracking behavior to me.  I'm interested
in why you don't read it that way because we may have different
definitions.

> [3] says that "tracking is the collection of data regarding a
> particular user's activity across multiple distinct contexts
> and the retention, use, or sharing of data derived from that
> activity outside the context in which it occurred. A context
> is a set of resources that are controlled by the same party or
> jointly controlled by a set of parties."

By this definition, one can do just about anything one likes to
capture information about user behavior as long as that user
doesn't leave "a context" and one then defines "context" in an
appropriately broad way.  In particular...

> *.ietf.org servers are single context controlled by the same
> party (IETF).  The proposed implementation plan is a
> self-hosted solution which does indeed collect activity data
> but NOT across "multiple, distinct contexts".

Really?  First, whether *.ietf.org servers are "controlled by
the same party" is questionable.  One could suggest that they
are "controlled by" some combination of the IETF LLC, AMS, the
Tools Team, and maybe some "cloud" or "CDN" suppliers.  Perhaps
at least some of those relationships are tightly enough
specified contractually to make the control by the IETF LLC (not
the IESG) clear, but "the IETF" (as a community of participants)
doesn't know enough about the details of those contracts to be
confident about that.  And, unless the Tools Team started being
subject to Close Technical Supervision while I wasn't looking,
pages in ietf.org subdomains they control and manage cannot be
said to be "controlled by" the IETF.
 
IMO, lots of loose ends here.  From my point of view, the most
important is why we actually need to do this, what we hope to
accomplish, and, from those things, what data we actually need.
The "information to be collected" part of the proposal is not
especially helpful in this regard.  Some of the comments do help
in imagining what is intended and why, but I don't think we
should need to imagine.

best,
   john




