Re: apache fails to show jpg and not find files

deh wrote:

awarnier wrote:
Hi.

Maybe the very first thing you need to do, if you are going to use your email program to post to lists such as this one, is to turn off all these nice features like "view as html" and "send as html". View, compose and send as plain text, or you are going to confuse yourself and others no end, especially if what you want to include is html in the first place.

The second thing is to just decide once and for all how you download the pages from the original server, and then stick to one way for now. In other words, you have downloaded the pages one way or another, and they are now as they are, and we will try to understand what is going on. If you keep on changing the contents of the pages as we are trying to help, you are going to get everyone confused again, including yourself.

Next, what you show in your log below and seem to consider as a problem (accesses to a directory instead of a file), is actually normal. When the browser asks Apache for a document at "/a/b/c/d/index.html", Apache will explore this whole path, element by element. So it will first look for "/a", and if it doesn't even find that, it will log an error in the log for "/a" and not go any further. Similarly, if it finds "/a" and then "/a/b", and then "/a/b/c", but then not "/a/b/c/d", it will log an error for that, and never even look for "/a/b/c/d/index.html".
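
To make the element-by-element lookup concrete, here is a minimal Python sketch of the idea. It is only a simplified model, not Apache's actual code: the helper name first_missing_component and the temporary directory are made up for the illustration.

```python
import os
import tempfile

def first_missing_component(root, uri_path):
    """Return the shortest prefix of uri_path that does not exist under root, or None."""
    current = root
    walked = ""
    for part in uri_path.strip("/").split("/"):
        current = os.path.join(current, part)
        walked += "/" + part
        if not os.path.exists(current):
            # Stop at the first element that is missing; nothing deeper is checked.
            return walked
    return None

# Demo in a throwaway directory where only a/b exists.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    missing = first_missing_component(root, "/a/b/c/d/index.html")
    print(missing)
```

In this demo the reported element is "/a/b/c", a directory and not the file that was asked for, which is the same pattern as an error_log line that names only part of the path.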

Next, Apache itself will do fine with links as long as you want, as long as it can actually find what the browser is telling it to find under the DocumentRoot. The user-id under which Apache is running also needs to be able at least to read all these directories and files. So verify this, so that we are not chasing the wrong issue (we don't know which lines you are /not/ showing us from your logs).

And finally, the important part is to figure out what the browser is actually asking for.
You can figure that from your Apache access logs.
Check first whether the access log actually shows accesses to files that really exist on your disk, where the browser is asking for them. Either the browser is asking for the wrong thing (due to incorrect links in the pages), or else the browser is asking for the right thing, but the asked-for document really isn't there.
Which is it?

Suggestion :
- stop Apache
- delete all the logs
- start Apache
- in your browser, start with the very top document, then step by step check your access log, verifying that what you think the browser should be asking for at each browser click, is really what it is asking for.

At the first discrepancy, stop and post the relevant access log lines.
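
If reading the access log by eye gets tedious, the check can be roughly automated. The following is a small Python sketch, assuming the log uses the usual combined format. The regular expression and the helper name failed_requests are my own, and the two sample lines are access_log entries posted further down in this thread, joined onto one line each.

```python
import re

# Matches the request line and status code of a combined-format access log entry.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<uri>\S+) [^"]*" (?P<status>\d{3}) ')

def failed_requests(lines):
    """Yield (status, uri) for every request answered with a 4xx or 5xx status."""
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status")[0] in "45":
            yield int(match.group("status")), match.group("uri")

sample = [
    '10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] "GET / HTTP/1.1" 304 - "-" '
    '"Opera/9.64 (X11; Linux i686; U; en) Presto/2.1.1"',
    '10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] '
    '"GET /imagelib/sitebuilder/layout/spacer.gif HTTP/1.1" 404 1145 '
    '"http://10.143.15.1:41574/" "Opera/9.64 (X11; Linux i686; U; en) Presto/2.1.1"',
]

failures = list(failed_requests(sample))
print(failures)
```

The first URI this reports is the first discrepancy to post, together with the lines around it.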


deh wrote:
Krist van Besien wrote:
On Wed, Apr 8, 2009 at 5:32 AM, deh <dhaselwood@xxxxxxxxxxx> wrote:
I wanted to set up my web pages that are on verizon.net on my local network
with a machine running Suse 11.1/apache2.  I downloaded the web pages with
'wget -a -k' into a directory and set the directory/root and directory for
the apache .conf file to the directory holding 'index.html' in the
downloaded directories and set the permissions.  When the server is
accessed, the web page text presents, but there are only boxes for the .jpg,
.jpeg files and the one case where there is a file to be downloaded it shows
"Object not Found, Error 404".

If I access 'index.html' from the browser (Konqueror) on the server machine
everything is correct, so it looks like the paths to the files are being
handled differently with apache than the browser.

I'm new at this and need some direction as to where to look.

In your case it is probably the -k option to wget that is the problem.
This option tells wget to convert all hyperlinks so that they are
suitable for local viewing. This is why you can see your site in
konqueror.
In order to mirror your site locally just get all the html files using
a file transfer client, so that you get exactly the same as on your
server.

If you are still experiencing 404 errors afterwards the way to start
solving these is to look in the error log. If you don't understand
what you see there, you can come back here and ask us :-)

Krist

--
krist.vanbesien@xxxxxxxxx
krist@xxxxxxxxxxxxx
Bremgarten b. Bern, Switzerland
--
A: It reverses the normal flow of conversation.
Q: What's wrong with top-posting?
A: Top-posting.
Q: What's the biggest scourge on plain text email discussions?

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server
Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
   "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx


Krist,

Thanks for the response.
Dropping the '-k' option didn't fix the problem.  Here is a snip of
error_log output as well as a look at the html file.  At the moment it looks
like apache2 is truncating the path/file name.
error_log using webpage downloaded with wget -r
[Wed Apr 08 12:32:23 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer: http://10.143.15.1:41574/
[Wed Apr 08 12:32:25 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer: http://10.143.15.1:41574/
[Wed Apr 08 12:32:25 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/sitebuildercontent, referer: http://10.143.15.1:41574/
[Wed Apr 08 12:32:26 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/favicon.ico, referer: http://10.143.15.1:41574/

error_log using webpage downloaded with wget -r -k
[Wed Apr 08 13:25:24 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer: http://10.143.15.1:41574/
[Wed Apr 08 13:25:31 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer: http://10.143.15.1:41574/
[Wed Apr 08 13:25:31 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/sitebuildercontent, referer: http://10.143.15.1:41574/
[Wed Apr 08 13:25:39 2009] [error] [client 10.143.15.6] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/sitebuildercontent, referer: http://10.143.15.1:41574/

The problem is that these only have a partial path, and no file name.

Line from file--
~/webpage/mysite.verizon.net/res7yvp2/w4dh22/index.html
in the web page downloaded with--
wget -r
[I changed "<" to "#" since with "<" Preview Message didn't show the path/file]
<td width="5">#img src="/imagelib/sitebuilder/layout/spacer.gif" width="1" height="1" alt=""><br></td>

Same line as foregoing from web page downloaded with wget -r -k
<td width="5">#img src="../../imagelib/sitebuilder/layout/spacer.gif" width="1" height="1" alt=""><br></td>

The latter path/file is correct, as the imagelib is up two levels from the
directory holding index.html.
For example, this is what it should be--
/home/deh/webpage/mysite.verizon.net/imagelib/sitebuilder/layout/spacer.gif

Conclusion:
1) The '-k' option in 'wget' stores the correct path with respect to the
index.html (DirectoryRoot) path.
2) Neither works correctly with apache
3) In the error_log, the file name is missing and the path is incomplete

Could it be that the path is simply too long?

Don






awarnier,

Here is another go at this--

Here is the 1st error from error_log--
[Wed Apr 08 15:27:56 2009] [error] [client 10.143.15.10] File does not exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer: http://10.143.15.1:41574/

Here is the access_log--
10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] "GET / HTTP/1.1" 304 - "-" "Opera/9.64 (X11; Linux i686; U; en) Presto/2.1.1"
10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] "GET /index.html HTTP/1.1" 304 - "http://10.143.15.1:41574/" "Opera/9.64 (X11; Linux i686; U; en) Presto/2.1.1"
10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] "GET /imagelib/sitebuilder/layout/spacer.gif HTTP/1.1" 404 1145 "http://10.143.15.1:41574/" "Opera/9.64 (X11; Linux i686; U; en) Presto/2.1.1"

Here is where the file that was not found resides--
deh@PIII:~/webpage/mysite.verizon.net/imagelib/sitebuilder/layout> ls -l
total 16
-rwxr-xr-x 1 www users 43 2007-06-06 13:05 blank.gif
-rwxr-xr-x 1 www users 67 2007-06-06 13:05 spacer.gif

Here is where index.html resides and <DirectoryRoot> is set--
> deh@PIII:~/webpage/mysite.verizon.net/res7yvp2/w4dh22> ls -l

Here is a first problem:
- I presume <DirectoryRoot> is a typo, and you really mean that you have, in your httpd.conf,
  DocumentRoot /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22
right?
- assuming yes, the above directory is thus the DocumentRoot of your server. Any GET request URI is thus going to be interpreted by Apache as relative to that directory. The request for URI "/imagelib/sitebuilder/layout/spacer.gif" is thus going to be interpreted by Apache as a request for
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib/sitebuilder/layout/spacer.gif
(one line)
which is obviously not the place where the file resides.
So Apache responds 404, and rightly so.
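
The mapping just described can be written out as a short Python sketch. It is a deliberately simplified model that ignores Alias and every other directive, and the helper name map_to_filesystem is invented for the illustration. The two paths are the ones quoted in this thread.

```python
import posixpath

DOCUMENT_ROOT = "/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22"

def map_to_filesystem(document_root, uri_path):
    """Append the request path to DocumentRoot (simplified: no Alias handling)."""
    return posixpath.normpath(document_root + "/" + uri_path.lstrip("/"))

# Where the server looks for the requested image.
requested = map_to_filesystem(DOCUMENT_ROOT, "/imagelib/sitebuilder/layout/spacer.gif")
# Where the image actually is, according to the ls output above.
actual = "/home/deh/webpage/mysite.verizon.net/imagelib/sitebuilder/layout/spacer.gif"

print(requested)
print(requested == actual)
```

The two paths differ by the "res7yvp2/w4dh22" part, so the comparison comes out false.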


total 80
-rwxr-xr-x 1 www users 21461 2009-04-08 14:47 id2.html
-rwxr-xr-x 1 www users 30840 2009-04-08 14:47 id3.html
-rwxr-xr-x 1 www users 22716 2009-04-08 14:47 index.html

Here is the line from the html file--
<td width="257" bgcolor="#FFFFFF" colspan="2" rowspan="3" valign="top">
../../imagelib/sitebuilder/layout/spacer.gif <br></td>
[BTW, the html box is not checked, but if the following line is included
above with its "<" and ">", it shows up correctly in the message entry box,
but in the Preview Message there is only a faint icon of a page.  Is there
someplace else where html gets turned on besides where the message is
posted?]
img src="../../imagelib/sitebuilder/layout/spacer.gif" alt=""

This path/file is correct when starting from the directory with index.html.
The 'GET' in the access file would be correct if the "../../" was prepended
to the path/file.

That's where I think you have a slight misunderstanding.
The path
src="../../imagelib/sitebuilder/layout/spacer.gif"
is something that the /browser/ will interpret, but it will never send this as a URI to the server.

How does the browser "think" ?
Say it sends a first request to the server for "/", and it gets back a page. Now for the browser, "/" is the "base location" where it got this current page which it is busy displaying. Next, the browser, in that page that it is trying to display, finds a reference to another element, which it needs to display the page. That is the image tag
<img src="../../imagelib/sitebuilder/layout/spacer.gif">
The browser is going to try, using the "base location" and the (in this case relative) address of the image, to build an absolute URI requesting this element. But it knows that it cannot get any item that would be above the Root of the document tree, which is "/". So in this case it will just strip off the "../.." part of that relative URI, and send a request for
/imagelib/sitebuilder/layout/spacer.gif
which you see in your logs, and which the server is going to interpret as a request relative to your DocumentRoot, thus for
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib/sitebuilder/layout/spacer.gif
which does not exist, which --> 404.
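
This client-side resolution can be reproduced with Python's standard urllib.parse.urljoin, which (in Python 3.5 and later) follows the same RFC 3986 rules for relative references. This is only a sketch of the resolution step, not of what any particular browser does internally. The base URL is the one from the referer field of the logs.

```python
from urllib.parse import urljoin, urlsplit

# The page was fetched from "/", so that is the base location.
base = "http://10.143.15.1:41574/"
resolved = urljoin(base, "../../imagelib/sitebuilder/layout/spacer.gif")

# The path part is what ends up in the GET line of the access log.
request_path = urlsplit(resolved).path
print(request_path)
```

The "../.." has nowhere to go above "/", so it is dropped and the path that remains is /imagelib/sitebuilder/layout/spacer.gif.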

It would be different if, for example, the browser had obtained the current page from a request to
/dir1/dir2/dir3/index.html
Then the above would be its "base location", and if in that page it found a reference to
../../imagelib/sitebuilder/layout/spacer.gif, then it would do as follows:
- remove the last part of the base location (index.html), leaving "/dir1/dir2/dir3/"
- add to that the relative link found, giving
/dir1/dir2/dir3/../../imagelib/sitebuilder/layout/spacer.gif
- request that URI from the server.
- the server would then interpret that URI relative to your DocumentRoot
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22
which would in the end give a path of
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/dir1/dir2/dir3/../../imagelib/sitebuilder/layout/spacer.gif
which after eliminating the embedded ..'s would be
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/dir1/imagelib/sitebuilder/layout/spacer.gif
which hopefully would be correct.
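
The same hypothetical case, worked through in Python as a sketch. The /dir1/dir2/dir3/ base is the made-up example from the steps above, and the host is taken from the logs. One difference from the step-by-step description: urljoin, following RFC 3986, already collapses the ".." segments on the client side, so the server receives the shortened path. The resulting file path is the same either way.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

# Hypothetical base location, as in the example above.
base = "http://10.143.15.1:41574/dir1/dir2/dir3/index.html"
resolved = urljoin(base, "../../imagelib/sitebuilder/layout/spacer.gif")
request_path = urlsplit(resolved).path

# The server then appends the request path to its DocumentRoot.
document_root = "/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22"
file_path = posixpath.join(document_root, request_path.lstrip("/"))

print(request_path)
print(file_path)
```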

Have I lost you yet ?

Now the beginning of a solution to your problem, if you want to avoid correcting your DocumentRoot (and maybe creating other issues), or modifying all the links in the pages.
Assuming that the directory
/home/deh/webpage/mysite.verizon.net/imagelib/
is the real base of all the image links in your pages, then add
the following to your configuration:

Alias /imagelib/ /home/deh/webpage/mysite.verizon.net/imagelib/
<Directory /home/deh/webpage/mysite.verizon.net/imagelib/>
  Order Allow,Deny
  Allow from all
</Directory>

You need this Directory section, because it is in fact outside of your DocumentRoot hierarchy, and Apache will not normally allow you to get documents from there.
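
To see what the Alias changes, here is a rough Python model of the lookup: a request path that starts with the Alias prefix is mapped to the Alias target, and everything else still goes under DocumentRoot. This is an illustration only. The function name map_request is invented, and access control (the Directory section) is not modelled at all.

```python
import posixpath

DOCUMENT_ROOT = "/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22"
# URL prefix -> filesystem target, mirroring the Alias line above.
ALIASES = {"/imagelib/": "/home/deh/webpage/mysite.verizon.net/imagelib/"}

def map_request(uri_path):
    """Map a request path to a file path: Alias prefixes first, then DocumentRoot."""
    for url_prefix, target in ALIASES.items():
        if uri_path.startswith(url_prefix):
            return posixpath.join(target, uri_path[len(url_prefix):])
    return posixpath.join(DOCUMENT_ROOT, uri_path.lstrip("/"))

image = map_request("/imagelib/sitebuilder/layout/spacer.gif")
page = map_request("/index.html")
print(image)
print(page)
```

With the Alias in place the image request lands on the directory where spacer.gif really is, while index.html is still served from the DocumentRoot.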

Stop Apache, clear your logs, start Apache and try again.

The real solution would be to have your DocumentRoot set to
/home/deh/webpage/mysite.verizon.net/
and make sure your top index page is there.
Then you could remove the Alias and <Directory> section above.

