On Jun 22, 2009, at 7:03 PM, Amos Jeffries wrote:
On Mon, 22 Jun 2009 15:50:51 -0300, Alejandro Martinez
<amartinez@xxxxxxxxxxxxxx> wrote:
Amos, thanks for your reply.
The last question: how do you estimate that time?
Now there is a deep philosophical question. You would probably do best
to look up research on it to get an accurate answer. [...]
I realize that I'm jumping into this discussion late, but I think that
the most productive route would be to look at the algorithms used by
website statistics programs which claim to report such things.
Basically: requests are broken into 3 sets: unprocessed, selected, and
finished.
1) Take request X as a starting point.
2) Select all requests that have X as the referer and were made within
30 seconds of it (user max attention span for page load time).
3) Repeat (2) until no more requests are added using referer info.
4) Pick a timespan T and select all requests made from the same IP as X
within time T of the existing range.
5) Shuffle the request we started with in (1) into the finished pile.
6) For each request now selected in (4), go back and repeat (1)->(4)
with X being that selected request.
7) Repeat steps (2)->(6) until there are no new requests to handle.
8) Track the earliest and latest timestamp of all requests processed
above. Your 'visitor time' for _one_ session is the difference between
those two.
That seems plausible to me. Considering the widespread use of NAT,
where many users share one public address, I'd recommend matching on
User-Agent along with IP in step 4.
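
A rough Python sketch of the steps above, with the User-Agent
suggestion folded into step 4, might look like the following. It is
only an illustration under stated assumptions, not how any particular
webstats program works. The Request record, the names client_key,
grow_session and visit_durations, and the 600-second value chosen for
T are all hypothetical. The referer step is applied as described,
without a client check, so on a busy site it could merge two visitors
who load the same page within 30 seconds.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

REFERER_WINDOW = 30.0  # seconds, the page-load window from step 2
IDLE_WINDOW = 600.0    # the "timespan T" of step 4; this value is an assumption


@dataclass(frozen=True)
class Request:
    """One parsed log entry (hypothetical record layout)."""
    ts: float                      # request time in seconds
    ip: str
    user_agent: str
    url: str
    referer: Optional[str] = None


def client_key(r):
    # IP plus User-Agent, so users behind the same NAT are told apart.
    return (r.ip, r.user_agent)


def grow_session(start, unprocessed, idle_window=IDLE_WINDOW):
    """Grow one session from `start`, removing its members from `unprocessed`.

    Returns (finished_requests, visitor_time_in_seconds).
    """
    unprocessed.remove(start)
    selected = deque([start])      # picked, but not yet used as X
    finished = []                  # already used as X
    lo = hi = start.ts             # time range covered by the session so far
    while selected:
        x = selected.popleft()
        picked = [
            r for r in unprocessed
            # referer step: X is the referer and r follows within 30 s
            if (r.referer == x.url and 0 <= r.ts - x.ts <= REFERER_WINDOW)
            # same-client step: within T of the existing range
            or (client_key(r) == client_key(x)
                and lo - idle_window <= r.ts <= hi + idle_window)
        ]
        for r in picked:
            unprocessed.remove(r)
            selected.append(r)
            lo, hi = min(lo, r.ts), max(hi, r.ts)
        finished.append(x)
    # last step: visitor time is latest minus earliest timestamp
    return finished, hi - lo


def visit_durations(requests):
    """Split all requests into sessions and return each session's duration."""
    unprocessed = sorted(requests, key=lambda r: r.ts)
    durations = []
    while unprocessed:
        _, duration = grow_session(unprocessed[0], unprocessed)
        durations.append(duration)
    return durations
```

Calling visit_durations() on a list of Request records returns one
duration per session, in order of each session's first request. A
session consisting of a single request comes out as zero seconds, which
is one of the known weaknesses of estimating visit time from logs.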
I highly recommend finding an existing tool that does all this for you.
Writing it from scratch is complicated at best, and I'm sure I left
some complex issue out of my quick description above.
Agreed. And I think that the place to look for existing tools is in
various webstats programs.
Cheers,
-j
--
Jeffrey Goldberg http://www.goldmark.org/jeff/