Luben Tuikov <ltuikov@xxxxxxxxx> writes:

> Use File::MMagic to determine the MIME type of a blob/file.
> The variable magic_mime_file holds the location of the
> "magic.mime" file, usually "/usr/share/file/magic.mime".
> If not defined, the magic numbers internally stored in the
> File::MMagic module are used.

I am sorry to ask you this, but would you mind redoing this patch
without the File::MMagic bits?

I think giving "a=blob" the ability to automatically switch to
git_blob_plain is a good addition (as is your earlier patch to give
a direct link to reach blob_plain from the list), so let's have that
part in first.  I haven't applied your earlier one but it will
appear in "next" shortly.

The existing filename-based mimetypes_guess should be a lot cheaper
than exploding a blob and feeding it to File::MMagic.  I was hoping
File::MMagic would be used when we cannot guess the content type
that way (i.e. when mimetypes_guess returns undef or
application/octet-stream).

Since the repository owner can correct misidentification by the
standard /etc/mime.types by supplying a custom per-repository
$mimetypes_file (modulo that the current implementation of
mimetype_guess_file does not allow it if the file does not have an
extension that is specific enough), File::MMagic might be overkill,
especially if used in the way this patch does.

To allow finer-grained differentiation that cannot be done with
file extensions alone (e.g. some files may have a .dat extension,
but one can be a VCD mpeg wrapped in RIFF and another can be a
Z-machine story file), it might be simpler to allow the repository
owner to specify the full $file_name for such an ambiguous file in
their custom $mimetypes_file, and try to match it in the
mimetype_guess_file sub.  That way we may not even need to use
File::MMagic.

Are there cases where only $hash is given without $file_name?
If so we may need to fall back on File::MMagic in such a case after
all, but the get_blob_mimetype sub copies the whole blob to a
temporary file to work around a problem with version 1.27 that you
describe in the comment.  This is way too much (and nobody seems to
clean up the tempfile).  Looking at magic.mime, I suspect we might
be able to get away with the first 4k bytes or so at most (the
largest offset, except for the iso9660 image, is "Biff5" appearing
at 2114 to signal an Excel spreadsheet).
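
To make the order of checks suggested above concrete, here is a rough
sketch, not gitweb code.  Everything named in it is made up for
illustration: the %mimemap table (standing in for a parsed
$mimetypes_file that accepts full file names as well as extensions),
the guess_by_name and blob_mimetype_sketch subs, the two code refs
passed in, and the "application/x-zmachine" type.  It tries the cheap
name-based lookup first, and only looks at the first 4096 bytes of the
blob when the name gives undef or application/octet-stream.

```perl
use strict;
use warnings;

# Hypothetical table, as if parsed from a per-repository $mimetypes_file.
# Keys are either a full file name or a bare extension (an assumption,
# not the current file format).
my %mimemap = (
    'story.dat' => 'application/x-zmachine',   # made-up type for one file
    'dat'       => 'application/octet-stream',
    'txt'       => 'text/plain',
);

# Cheap guess from the name alone: full file name first, then extension.
sub guess_by_name {
    my ($map, $file_name) = @_;
    return unless defined $file_name;
    (my $base = $file_name) =~ s{.*/}{};       # strip leading directories
    return $map->{$base} if exists $map->{$base};
    if ($base =~ /\.([^.]+)$/) {
        my $ext = lc $1;
        return $map->{$ext} if exists $map->{$ext};
    }
    return;
}

# Expensive fallback, used only when the name tells us nothing useful.
# $read_prefix returns at most the requested number of leading bytes of
# the blob; $sniff maps those bytes to a MIME type or undef.  Both are
# caller-supplied and hypothetical.
sub blob_mimetype_sketch {
    my ($map, $file_name, $read_prefix, $sniff) = @_;
    my $type = guess_by_name($map, $file_name);
    return $type
        if defined $type && $type ne 'application/octet-stream';
    my $sniffed = $sniff->($read_prefix->(4096));
    return $sniffed if defined $sniffed;
    return defined $type ? $type : 'application/octet-stream';
}

# Stand-ins for reading a blob and for a content sniffer.
my $blob  = '%PDF-1.4 rest of the blob';
my $read  = sub { my ($len) = @_; return substr($blob, 0, $len) };
my $sniff = sub {
    my ($head) = @_;
    return $head =~ /^%PDF-/ ? 'application/pdf' : undef;
};

print blob_mimetype_sketch(\%mimemap, 'games/story.dat', $read, $sniff), "\n";
print blob_mimetype_sketch(\%mimemap, undef, $read, $sniff), "\n";
```

If File::MMagic is kept for the $hash-only case, its
checktype_contents method could play the role of $sniff here, fed
with the same short prefix instead of a temporary file.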