On Fri, 10 Dec 2010, J.H. wrote: >>> This is fairly self explanatory, these are here just to centralize the checking >>> for these types of actions, as special things need to be done with regards to >>> them inside the caching engine. >>> >>> isBinaryAction() returns true if the action deals with creating binary files >>> (this needing :raw output) >> >> Why do you need special case binary / :raw output? It is not really >> necessary if it is done in right way, as shown in my rewrite. > > Because that's not how my caching engine does it, and the reason for > that is I am mimicking how the rest of gitweb does it. To shorten the explanation why treating binary (needing :raw) output in a special way is not necessary: with the way gitweb code is structured (with "binmode STDOUT, ':raw'" inside action subroutine), with the way capturing output is done (by redirecting STDOUT), and even with the way kernel.org caching code is structured the only thing that needs to be done to support both text (:utf8, as set at beginning of gitweb) and binary (:raw) output is to *dump cache to STDOUT in binary mode*: binmode $cache_fh, ':raw'; binmode STDOUT, ':raw'; File::Copy::copy($fh, \*STDOUT); Nothing more. Just dump cache file to STDOUT in binary mode. > I attempted at one point to do as you were suggesting, and it became too > cumbersome. I eventually broke out the 'binary' packages into a special > case (thus mimicking how gitweb is already doing things), which also > gives me the advantage of being able to checksum the resulting binary > out of band, as well as being able to more trivially calculate the file > size being sent. I don't see how it needs to be special-cased: the ordinary output would also take advantage of this. Note that plain 'blob' action can also be quite large. If there is to be done smarter, i.e. HTTP-aware, parsing and dumping of cache entry file, e.g. by reading the HTTP header part to memory and fiddling with HTTP headers (e.g. adding Content-Length header), it can be done in a contents-agnostic way. Note that with the way I do it in my rewrite, namely saving cached output to temporary file to rename it to final destination later (atomic update), we can do mungling of HTTP headers before/during this final copying to final file, e.g. calculating Content-Length and perhaps Content-MD5 headers. > >>> isFeedAction() returns true if the action deals with a news feed of some sort, >>> basically used to bypass the 'Generating...' message should it be a news reader >>> as those will explode badly on that page. >> >> Why blacklisting 'feed', instead of whitelisting HTML-output? > > There are a limited number of feed types and their ilk (standard xml > formatted feed and atom), there are lots of html-output like things. > Easier to default and have things work, generally, than to have things > not work the way you would expect. Ah, I see from what you written in other subthreads of this thread that you prefer to have "Generating..." page where it is not wanted that not have it where it could be useful (i.e. blacklist approach), while I took the opposite side (i.e. whitelist approach). -- Jakub Narebski Poland -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html