Me again :)

Some progress; hopefully this will be of use to anyone else who runs into this problem.

I have managed to replicate the issue 100% of the time by using the '--limit-rate' parameter in wget, which produced some very interesting results. When limiting the download rate to 20 kbps and requesting the PDF, the download would fail at around 22%, and Apache returned the following in the error log with debug on.
snip
[Tue Jul 17 11:18:50 2007] [debug] mod_disk_cache.c(1043): disk_cache: Body for URL http://sitename/mypdf.pdf? cached.
[Tue Jul 17 11:18:50 2007] [debug] mod_proxy_http.c(1537): proxy: end body send
[Tue Jul 17 11:18:50 2007] [debug] proxy_util.c(1816): proxy: HTTP: has released connection for (*)
<<snip

The failed request would then remain in the cache. When forcing a cache refresh by updating the Last Modified time, and not limiting the connection, the entire file was downloaded, but not held in cache due to ..
snip
7 11:30:10 2007] [debug] mod_disk_cache.c(1007): cache_disk: URL http://sitename/mypdf.pdf? failed the size check (1000872 > 1000000)
<<snip

This was quickly resolved by setting CacheMaxFileSize to 10MB. After this, Apache was happy to serve the entire PDF at 20 kbps if the file was present in cache, but it suffers from the same problem if the file is not in cache and it has to go to Tomcat.

What is interesting is that, when checking the Tomcat access log, there is a delay between my request and the log entry. It appears that Tomcat is deciding how much of the file to send depending on my connection speed: when I changed my connection speed, Tomcat changed the size recorded in the log entry, and the connection would fail on reaching the number of bytes displayed in the access log.

All I need to do now is stop Tomcat from attempting to do the byte range serving, and I think the issue will be resolved.

Regards,
Mark.

On 17/07/07, Mark Stevens <mark.stevens99@xxxxxxxxxxxxxx> wrote:
Hi Jacqui,

Thanks for the response.

Initially I suspected the issue could have been related to client type; however, I was able to create a broken item in the cache by running the following command from a remote server:

wget -S --no-cache 'http://sitename/mypdf.pdf?<random number>'

I then kept changing the random number until I received HTTP 416 (Requested range not satisfiable).

On getting the 416 response, the item would remain in the cache at smaller than the expected size. When attempting to view it via a browser, the response is 'the file is damaged and could not be repaired'.

As I was testing against the live site, it is possible someone had sent a byte range request during my testing with wget, and then maybe Apache stored the byte range as the entire item.

I'll continue testing and let you know how I get on. If I don't get a resolution soon, I will try rolling back to Apache 1.3 to rule out mod_cache as being the culprit.

Thanks again,
Mark.

On 17/07/07, Jacqui caren <jacqui.caren@xxxxxxxxxxxx> wrote:
> Mark Stevens wrote:
> > Anyone?
>
> It is likely that PDF viewers will ask for byteranges.
>
> If the cache is storing what is requested rather than the entire
> file, then this makes sense. IIRC mod_proxy does the correct thing
> (requests the byterange it does not have and puts chunks together then
> serves the requested range).
>
> PDFs were designed so that the TOC is at the head of the document.
> If you find that you are only storing the first NNNN bytes
> and then only sporadic contiguous chunks I would assume
> byterange requests are the problem and hand code a number
> of test requests to confirm it.
>
> HTH
>
> > On 16/07/07, Mark Stevens <mark.stevens99@xxxxxxxxxxxxxx> wrote:
> >
> >> Has anyone had problems in the past with Apache mod_cache storing
> >> incomplete versions of files such as PDF's, and if so did you manage
> >> to resolve it?
> >>
> >> The problem is intermittent, and I can confirm PDF's from the origin
> >> source are OK.
> >>
> >> I would be interested in any combination of setup and version of
> >> Apache you may have seen this with.
> >>
> >> I posted something related to this issue regarding removal of
> >> individual files from cache, sorry if this is seen as double posting,
> >> but felt I ought to have been more direct.
> >>
> >>
> >> Many thanks in advance.
> >>
> >> Mark.
> >>
>
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP Server Project.
> See <URL:http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
> " from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
> For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
>
>
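
For anyone wanting to try the hand-coded byte range test requests suggested in the quoted reply above, here is a rough sketch, assuming Python 3 and only its standard library. The helper names (range_header, parse_content_range, fetch_range) and the byte offsets in the usage note are purely illustrative and do not come from this thread; treat it as a starting point rather than a definitive test.

```python
import urllib.error
import urllib.request


def range_header(first, last=None):
    """Build a Range header value for bytes first..last (inclusive).

    With last omitted, the range is open-ended ("from first to end of file").
    """
    if first < 0 or (last is not None and last < first):
        raise ValueError("invalid byte range")
    return "bytes=%d-%s" % (first, "" if last is None else last)


def parse_content_range(value):
    """Parse a Content-Range value into (first, last, total).

    'bytes 0-1023/1000872' gives (0, 1023, 1000872).
    'bytes */1000872' (typical of a 416 reply) gives (None, None, 1000872).
    An unknown total ('*') is returned as None.
    """
    unit, _, rest = value.strip().partition(" ")
    if unit != "bytes":
        raise ValueError("unsupported range unit: %r" % unit)
    span, _, total = rest.partition("/")
    total = None if total == "*" else int(total)
    if span == "*":
        return None, None, total
    first, _, last = span.partition("-")
    return int(first), int(last), total


def fetch_range(url, first, last=None, timeout=30):
    """Request one byte range and return (status, Content-Range, body length)."""
    req = urllib.request.Request(url, headers={"Range": range_header(first, last)})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Range"), len(resp.read())
    except urllib.error.HTTPError as err:
        # Error statuses such as 416 are raised as exceptions by urllib.
        return err.code, err.headers.get("Content-Range"), 0
```

A call such as fetch_range("http://sitename/mypdf.pdf?12345", 0, 1023) should come back with status 206 and a Content-Range header if the byte range was honoured, or status 200 with the full body if the Range header was ignored. Comparing the total reported in Content-Range against the size of the file on the origin server is one way to spot a truncated cache entry.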