The billing situation was resolved amicably, but in the spirit of understanding what really happened, we've been investigating the cause and effect. It appears that the traffic came from (legitimate, non-malicious) Far East users, probably on the other side of a problematic Internet connection, most likely caused by the damage to an undersea cable in December. The result of the network unreliability was that they had difficulty downloading large data intact from the VTP server, and employed a common download-accelerator that is able to keep retrying the server and resuming a file-transfer partway through. All of this is very normal and common.
From the logs, it would appear that the download accelerator requests the file with a Range header (with a fairly large range). Apache then begins sending the data. Shortly afterward, the connection fails (after the client has received only a little of the data) and Apache logs the request. The process repeats. Eventually, the client does receive the entire file, but not until many, many 206 (partial-content) entries have been logged.
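To make the suspected mechanism concrete, here is a minimal Python sketch of it. This is only a model of my hypothesis, not of how Apache really behaves: it assumes that every retry asks for the rest of the file, that the log records the requested length, and that the client receives only a small piece before the connection drops. The function name, the 50 MB file size and the 500 KB per attempt are all made-up illustration values.

```python
def simulate_resumed_download(file_size, bytes_per_attempt):
    """Model of the hypothesis only: compare what would be logged
    with what would actually reach the client."""
    received = 0       # bytes the client really has so far
    logged_total = 0   # sum of the sizes written to the log
    attempts = 0
    while received < file_size:
        # The client resumes with something like "Range: bytes=<received>-".
        requested = file_size - received
        # ASSUMPTION: the log records the requested length, not the bytes delivered.
        logged_total += requested
        # The connection fails after only a little data gets through.
        received += min(bytes_per_attempt, requested)
        attempts += 1
    return attempts, logged_total, received


# Hypothetical numbers: a 50 MB file, 500 KB delivered per attempt.
attempts, logged, actual = simulate_resumed_download(50_000_000, 500_000)
print(attempts, "attempts")
print(logged, "bytes logged")
print(actual, "bytes actually received")
print(f"log overstates the transfer by a factor of {logged / actual:.1f}")
```

Under those assumptions the log would show roughly fifty times more data than the client ever received, which is the kind of inflation that would explain the numbers we are seeing.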
According to the host (Hurricane Electric, HE.net):

http://www.he.net/faq/traffic_storage.html

"What Methods do you use to determine traffic usage? We determine web traffic usage by extracting information from the access_log files generated by the HTTP daemon."
Based on this, we come to some interesting contradictions:

From http://vterrain.org/.status/web.html

    Time   Requests    Bytes Sent
    -----  --------  ------------
    00:00       893      18955972
    01:00     22384  141295771628
    02:00      4330    3881123738
    03:00       626       7740512

Yup, in one hour, a naive counting of Apache's "bytes transferred" says there was 141 GB of traffic. About a month's worth of normal traffic - in an hour.

Apache's log has endless amounts of this:

    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 10359197
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 42725451
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 15765857
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 37503695
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 21176362
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 5127115
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 10359197
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 42725451
    211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 37503695
    211.162.235.226 - - [03/Feb/2007:01:08:13 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 15765857
    211.162.235.226 - - [03/Feb/2007:01:08:13 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 21176362

If those were real bytes transferred, it would be something like 400 MB/second. This is to distant users happy to get K/second.

So, this raises a couple of interesting questions.

1. Is HE.net making a mistake in using access_log to count bandwidth?
   1a. What would be the right way to do it?
   1b. Are they doing it this way because they're using some existing tool that does it this way?
   1c. Are other hosts doing it this way, and therefore are mistakenly over-measuring and over-charging customers?
2. What exactly does the number after the 206 code in the access_log mean? Is it simply the range the client _requested_ via the Range header? In which case, it has no real relationship to how much data was _actually_ transferred?
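For anyone who wants to check the arithmetic, here is a rough Python sketch that does the same naive counting on the excerpt above: it sums the number after the 206 for each one-second timestamp, and also works out the average rate implied by the 01:00 row of the status table. The regular expression and the function name are my own and only assume the log format shown in the excerpt; this is not how HE.net's accounting tool works, which I have not seen.

```python
import re
from collections import defaultdict

# The eleven log lines quoted in this message.
LOG_LINES = """
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 10359197
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 42725451
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 15765857
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 37503695
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 21176362
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 5127115
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 10359197
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 42725451
211.162.235.226 - - [03/Feb/2007:01:08:12 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 37503695
211.162.235.226 - - [03/Feb/2007:01:08:13 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 15765857
211.162.235.226 - - [03/Feb/2007:01:08:13 -0800] "GET /data/island_8k.zip HTTP/1.1" 206 21176362
""".strip().splitlines()

# Matches: [timestamp] "request line" status size
PATTERN = re.compile(r'\[(?P<ts>[^\]]+)\] "[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)')


def logged_bytes_per_second(lines):
    """Sum the logged size of every 206 response, grouped by timestamp."""
    totals = defaultdict(int)
    for line in lines:
        m = PATTERN.search(line)
        if m and m.group("status") == "206" and m.group("size") != "-":
            totals[m.group("ts")] += int(m.group("size"))
    return dict(totals)


totals = logged_bytes_per_second(LOG_LINES)
for ts, total in sorted(totals.items()):
    print(f"{ts}  {total} bytes  ({total / 1e6:.1f} MB logged in that second)")

# Average rate implied by the 01:00 row of the status table.
hourly_rate = 141295771628 / 3600
print(f"01:00 hour average: {hourly_rate / 1e6:.1f} MB/s")
```

On this excerpt alone, the nine entries stamped 01:08:12 add up to about 223 MB for a single second from a single client, and the 01:00 row averages out to roughly 39 MB/s sustained for the whole hour.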
Appreciate any insight from anyone. As I said before, we resolved the billing problem with HE.net without any problem, and have been very happy with their service. But, if this is a common mistake (bandwidth measurement via access_log) then the nature of the misunderstanding should be made more public so that others can ensure they aren't burned by it.
I apologize for any mistakes in terminology or assumptions in my message. I'm not an Apache guru and I don't play one on TV. I'm a 3D graphics programmer.
--
Chris 'Xenon' Hanson | Xenon @ 3D Nature | http://www.3DNature.com/
"I set the wheels in motion, turn up all the machines, activate the programs, and run behind the scenes. I set the clouds in motion, turn up light and sound, activate the window, and watch the world go 'round." -Prime Mover, Rush.