Re: [PATCH] gitweb.cgi: Use File::MMagic; "a=blob" action knows the blob/file type

Luben Tuikov <ltuikov@xxxxxxxxx> writes:

> Use File::MMagic to determine the MIME type of a blob/file.
> The variable magic_mime_file holds the location of the
> "magic.mime" file, usually "/usr/share/file/magic.mime".
> If not defined, the magic numbers internally stored in the
> File::MMagic module are used.

I am sorry to ask you this, but would you mind redoing this
patch without File::MMagic bits?  I think giving "a=blob" an
ability to automatically switch to git_blob_plain is a good
addition (as is your earlier patch to give a direct link to
reach blob_plain from the list), so let's have that part in
first.  I haven't applied your earlier one but it will appear in
"next" shortly.

The existing filename-based mimetypes_guess should be a lot cheaper
than exploding a blob and feeding it to File::MMagic.  I was
hoping File::MMagic would be used only when we cannot guess the
content type that way (i.e. when mimetypes_guess returns undef or
application/octet-stream).
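
To illustrate the ordering suggested above, here is a minimal sketch in Python (not gitweb's Perl).  The names blob_mimetype, sniff_content and read_blob are hypothetical, and sniff_content is only a tiny stand-in for what File::MMagic would do.  The point is that the blob is read only when the file name yields nothing useful.

```python
import mimetypes


def sniff_content(data):
    """Hypothetical stand-in for a File::MMagic-style content check."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    return None


def blob_mimetype(file_name, read_blob):
    """Guess by file name first; read the blob only if that guess is useless.

    read_blob is a callable returning the blob contents, so the expensive
    step is skipped entirely when the name-based guess is good enough.
    """
    guessed = None
    if file_name:
        guessed, _encoding = mimetypes.guess_type(file_name)
    if guessed and guessed != "application/octet-stream":
        return guessed
    # Fallback: no file name, no guess, or only the generic type.
    return sniff_content(read_blob()) or "application/octet-stream"
```

Passing the blob as a callable rather than as data is a design choice for this sketch only; it keeps the cheap path from ever touching the object.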

Since the repository owner can correct misidentification by the
standard /etc/mime.types by supplying a custom per-repository
$mimetypes_file (modulo that the current implementation of
mimetype_guess_file does not allow it if the file does not have
an extension that is specific enough), File::MMagic might be
overkill, especially if used in the way this patch does.  To
allow finer grained differentiation that cannot be done with
file extensions alone (e.g. some files may have .dat extension
but one can be VCD mpeg wrapped in RIFF, and another can be a
Z-machine story file), it might be simpler to allow the
repository owner to specify full $file_name for such an ambiguous
file in their custom $mimetypes_file, and try to match it in
mimetype_guess_file sub.  That way we may not even need to use
File::MMagic.
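
A rough sketch of that lookup, again in Python rather than Perl and with hypothetical names (load_mimetypes, guess_from_table).  It assumes the custom file keeps the usual mime.types layout of a type followed by whitespace-separated keys, and simply lets a key be a full path as well as an extension; paths containing whitespace would need a different syntax.  The MIME types in the usage example are illustrative only.

```python
def load_mimetypes(text):
    """Parse mime.types-style text into a {key: mime_type} table.

    A key may be a bare extension ("dat") or a full file name
    ("games/zork.dat").  The first entry for a key wins.
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        mime_type, *keys = line.split()
        for key in keys:
            table.setdefault(key, mime_type)
    return table


def guess_from_table(file_name, table):
    """Try the full file name first, then fall back to the extension."""
    if file_name in table:
        return table[file_name]
    base = file_name.rsplit("/", 1)[-1]
    if "." in base:
        return table.get(base.rsplit(".", 1)[1])
    return None


custom = load_mimetypes("""
video/mpeg                movies/intro.dat
application/x-zmachine    games/zork.dat
application/octet-stream  dat
""")
```

With this table, guess_from_table("games/zork.dat", custom) picks the per-file entry, while any other .dat file falls through to the extension entry.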

Are there cases where only $hash is given without $file_name?
If so we may need to fall back on File::MMagic in such a case
after all, but get_blob_mimetype sub copies the whole blob to a
temporary file to work around a problem with version 1.27 you
state in the comment -- this is way too much (and nobody seems
to clean up the tempfile).  Looking at magic.mime, I suspect we
might be able to get away with the first 4k bytes or so at most
(the largest offset, apart from the iso9660 image, is "Biff5"
appearing at 2114 to signal an Excel spreadsheet).
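
As a sketch of the "first 4k bytes" idea, in Python and with hypothetical names (SNIFF_BYTES, MAGIC, sniff_prefix): read a bounded prefix from the blob stream and match magic strings at fixed offsets, with no temporary file.  The table is a tiny illustrative subset, not the contents of magic.mime, and the MIME type strings are assumptions; only the "Biff5" offset comes from the discussion above.

```python
import io

SNIFF_BYTES = 4096  # assumed cap; "4k bytes or so" per the estimate above

# (offset, magic bytes, MIME type): illustrative entries only.
MAGIC = [
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"%PDF-", "application/pdf"),
    (2114, b"Biff5", "application/vnd.ms-excel"),
]


def sniff_prefix(stream, limit=SNIFF_BYTES):
    """Read at most `limit` bytes from a binary stream and match magic."""
    head = stream.read(limit)
    for offset, magic, mime_type in MAGIC:
        if head[offset:offset + len(magic)] == magic:
            return mime_type
    return None


# Usage: a large fake blob; only the first 4096 bytes are consumed.
blob = io.BytesIO(b"\0" * 2114 + b"Biff5" + b"\0" * 100000)
detected = sniff_prefix(blob)
```

In gitweb the stream would be whatever pipe the blob is read from; here io.BytesIO stands in for it.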
