Re: [PATCH] gitweb: return correct HTTP status codes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Lea Wiemann <lewiemann@xxxxxxxxx> writes:

I would perhaps choose "gitweb: standarize HTTP status codes"
as the subject, but the one you chosen is also good.

> Many error status codes simply default to 403 Forbidden, which is not
> correct in most cases.  This patch makes gitweb return semantically
> correct status codes.

I'm not sure if "403 Forbidden" is not correct error code in the
absence of further information.

I'd like to have reasoning when, and why, we use which responses (HTTP
status codes) in which situations.  Link to appropriate RFC you used
(and perhaps to some other information) wouldn't be out of the
question.  See also below.

> For convenience the die_error function now only takes the status code
> without reason as first parameter (e.g. 404 instead of "404 Not
> Found"), and it now defaults to 500 (Internal Server Error), even
> though the default is not used anywhere.

Well, gitweb sometimes used different wording (different explanation)
for the same error message / HTTP status code, for example it is
generic "403 Forbidden", but when trying to access page which is not
available due to some restrictions (like feature being disabled) it is
"403 Permission Denied".

> Signed-off-by: Lea Wiemann <LeWiemann@xxxxxxxxx>
> ---
> I recommend that you apply this patch beforehand:
> [PATCH] gitweb: remove git_blame and rename git_blame2 to git_blame
> http://article.gmane.org/gmane.comp.version-control.git/84036/raw

First, why?  

Second, I think it is not possible, as IIRC removal of git-annotate
based git_blame got already applied.

> In some cases I've simply trusted the existing message (so when the
> message said "not found", I would make it a 404 error without checking
> whether the message is accurate).  Also, in some cases an overly
> generic 500 Internal Server Error is returned, e.g. in several cases
> like this one ...
> 
>  	open my $fd, "-|", git_cmd(), "ls-tree", $base, "--", $path
> -		or die_error(undef, "Open git-ls-tree failed");
> +		or die_error(500, "Open git-ls-tree failed");

Should we really use "500 Internal Server Error" here?  Usually this
would be not an error at all, but wrong parameters to git command,
i.e. it is _user_ who erred, not some error on _server_ side.  I would
use "500 Internal Server Error" if for example git binary could not be
found...

> ... I suspect that the error could be triggered by non-existent hashes
> as well, in which case the more specific 404 would be more appropriate
> -- however, the current implementation oftentimes doesn't allow for
> more fine-granulared checking, and I don't want to add actual code as
> long as we don't have tests.  So we'll go with catch-all 500 errors in
> the meantime; it's not a regression, at least.

...that is why I'd rather have "403 Forbidden" as catch-all error, as
it was done in gitweb.  But that also seems not very good idea.

It seems like this is a matter of taste.

> Jakub: Let's make this patch a prerequisite for the Mechanize tests so
> we can test for exact status codes rather than [45][0-9][0-9].

Could you please Cc me if you address me in email?  I am not
subscribed to git mailing list, I read it via GMane NNTP (Usenet news)
interface; and even if I had been subscribed it is better to Cc people
who might be interested (in the case of gitweb caching it would be
probably me, Petr Baudis, John Hawley, perhaps Luben Tuikov).

[...]
>  sub die_error {
> -	my $status = shift || "403 Forbidden";
> -	my $error = shift || "Malformed query, file missing or permission denied";
> +	my $status = shift || 500;
> +	my $error = shift || "Internal server error";

Errr, I think "Malformed query, file missing or permission denied" is
actually closer to truth, and better error message (it is displayed in
body of message)
  
> -	git_header_html($status);
> +	# Use a generic "Error" reason, e.g. "404 Error" instead of
> +	# "404 Not Found".  This is permissible by RFC 2616.
> +	git_header_html("$status Error");
>  	print <<EOF;

Even if it is _permissible_ by RFC 2616, I don't think it is
_recommended_ practice.  IIRC it is recommnded for some (all?) HTTP
error status codes to give more detailed explanation in the body of
message.

==================================================================

Proposed usage of HTTP server/client error status codes in gitweb:

1. Basic CGI parameter validation; this catches for example invalid
   action names and invalid order names, pathnames which doesn't look
   like good pathnames, hash* parameters which doesn't look like
   refnames, extra option parameters not on allowed list, page which
   is not a number, unknown action, etc.

   These should probably all return "400 Bad Request", or perhaps 
   "404 Not Found" as catch-all, not "403 Forbidden".

2. Gitweb cannot find requested resource; this includes requesting
   project which does not exists (and, for security reasons, also
   those excluded from gitweb by different means), tag does not exists
   etc.  I'm not sure if include there situations where for example
   filename is missing for a 'blob' view request.

   Here "404 Not Found" would be most valid.

3. Some required parameters are missing, for example action other than
   projects list (in any format) and no project provided, 

   I think that here "400 Bad Request" is probably best solution.

4. Project list is requested, but there are no projects (or no forks)
   to be displayed by gitweb.

   This is a strange situation indeed, and I'm not sure what proper
   HTTP code should be used.  I don't think that "404 Not Found" would
   be good solution...

5. When some feature is requested that is not enabled, but with
   different gitweb configuration would be available.  

   Authorization wouldn't change the response, so I guess best would
   be "403 Forbidden" or "403 Permission Denied".

6. Request is made for a wrong type of object, for example 'blame' on
   object which is not a blob.

   Here I guess "400 Bad Request" would be good solution (if not 404).

7. When the git command failed, and we don't know the reason...

-- 
Jakub Narebski
Poland
ShadeHawk on #git
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux