Re: [RFC PATCH v2] docs-rst: automatically convert Graphviz and SVG images

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 28 Nov 2016, Markus Heiser <markus.heiser@xxxxxxxxxxx> wrote:
> Hi Jon,
>
> you forgot to reply ML (and other CC) ... so I will do it here.
>
> Am 27.11.2016 um 18:54 schrieb Jonathan Corbet <corbet@xxxxxxx>:
>
>> On Sun, 27 Nov 2016 16:25:43 +0100
>> Markus Heiser <markus.heiser@xxxxxxxxxxx> wrote:
>> 
>>> Replacement for the sphinx ``figure`` and ``images`` directive.
>>> 
>>> A image (or figure) directive which make use of the *glob* notation::
>>> 
>>>  .. figure::  hello.*
>>> 
>>> will be converted automatically with the tool chains listed below.:
>>> 
>>> * DOT to SVG: ``DOT(1)`` (http://www.graphviz.org) If graphviz is not
>>>  available, the DOT language is inserted as literal-include.
>>> 
>>> * SVG to PDF:
>>> 
>>>  * CairoSVG (http://cairosvg.org) if installed or alternatively
>>>  * ``convert(1)``: ImageMagick (https://www.imagemagick.org)
>>> 
>>> Signed-off-by: Markus Heiser <markus.heiser@xxxxxxxxxxx>
>> 
>> So I think this is headed in the right direction.  I have a few thoughts,
>> though...
>> 
>> - Should we really replace the existing sphinx directives, or make a new
>>  one?  I'd be inclined toward the latter just to minimize the potential
>>  for confusion.
>
> Since this is for "figure" AND "image" (inline) I thought it is
> better to replace. But I'am open to change this .. concrete ideas?

Please don't shadow Sphinx directives. Otherwise people will look for
Sphinx documentation for the directive, and will be confused that it
doesn't do what the documentation says. And we'll be able to add our own
directive options to our own extensions without adding to the confusion.

>> - We don't want to do the conversion unconditionally, right?  SVG works
>>  just fine for HTML output, we just need to do it for latex, as I
>>  understand it.
>
> Yes, you are right. But Sphinx (docutils) is split into a reader and
> a writer part. For more details see "Build Phases" in the Sphinx
> extension tutorial:
>
>   http://www.sphinx-doc.org/en/1.4.9/extdev/tutorial.html#build-phases
>
> This patch applies to the reading-phase, where it is not clear if you
> build HTML or LaTeX output. Think about the environment with the parsed
> doctrees, which is build up in this phase and (re-)used for any output.
>  
> When it comes to phase 4 (Writing) the "writer" (latex2e, html, etc) 
> decide which formats of images will be the best, for this the writer
> looks at the reST-source folder which formats available.
>
> OK, by this explanation, you see, we have to convert in the writer
> phase, but this will bring some other drawbacks. E.g. we have 
> to touch all writers (html, latex2e etc.), this means we have to write
> our own builders for for each output format. If we wan't to be academic
> this might be the best solution, but IMO maintaining own builder
> is not what we want, so I choose the more *pragmatic* solution,
> creating all potential formats in he reading phase  ...  any ideas ?

Are we just doing an unnecessary conversion when we're only building
HTML, or will HTML also use the converted files while it could use the
source format directly?

>> - I don't get the ".*" syntax.  Why not just list the source file by its
>>  actual name, then do the magic to use the converted version (if needed)
>>  behind the scenes?
>
> My thought was: 
>
>   with ".*" the author marks this image for 'auto-conversion'.
>
> Sphinx itself does not convert any images, Sphinx's concept
> normally is, that the author commits all image formats to the
> source tree ... but this is not what Linus wants to see in the
> source tree.

There's a ton of extensions to take input in the source tree in format
x, and produce format y. And it seems to me Sphinx *extensions* can tell
Sphinx to use files from the output tree, even if .rst files themselves
can only reference the source tree.

I think the directive argument should explicitly be the reference to the
file in the source tree without globbing. The directive will then do the
right thing with the source figure/image file.

> With 'auto-conversion' and preprocessing content in general
> we often have trouble ....
>
> Sorry, when I go a bit OT, but often I think, that the Linux build
> and the Sphinx (reST) concept are orthogonal, when it comes to file
> locations and this is why we get so often in trouble with 
> 'auto-conversion' and preprocessing content in general ...
>
> a.) Linux build places all intermediate files under the $PWD folder,
>    which might be O=/tmp/kernel or something else
>
> b.) Sphinx build expected all files under the 'sourcedir' folder
>
>  Usage: sphinx-build [options] sourcedir outdir [filenames...]
>
> And if we look at the ".pyc" discussion it is the same:
>
> c.) Linux places ".obj" files under the $PWD folder
>
> d.) python byte-compiles the files *in-place*, which means in 
>    kernel's source, even if O=/tmp/kernel is set or not
>
> Thats why I voted for "to copy" files in [1], even if I'am
> not happy with such copy solutions ... 
>
> [1] https://www.mail-archive.com/linux-doc@xxxxxxxxxxxxxxx/msg07655.html
>
> I don't want to rewind to this discussion, I agree with what Jani
> says, I only wan't to remember about the deviation of these two build
> systems.

Markus, if you don't want to rehash an old discussion, then
just... don't. Right? ;)

As I said, unless I'm grossly mistaken, a Sphinx *extension* can read a
file in format x in the Sphinx (and Linux) source tree, convert it to
format y and write it to the Sphinx (and Linux) build tree, and have
Sphinx use the file from there.

It's just that having a Makefile do this before Sphinx invocation is
problematic, because we'd have to write the output in the source tree to
have Sphinx read it from there.

>>> Documentation/conf.py              |   2 +-
>>> Documentation/doc-guide/hello.dot  |   3 +
>>> Documentation/doc-guide/sphinx.rst |  18 ++++++
>>> Documentation/sphinx/figure.py     | 123 +++++++++++++++++++++++++++++++++++++
>>> 4 files changed, 145 insertions(+), 1 deletion(-)
>>> create mode 100644 Documentation/doc-guide/hello.dot
>>> create mode 100644 Documentation/sphinx/figure.py
>>> 
>>> diff --git a/Documentation/conf.py b/Documentation/conf.py
>>> index 1ac958c..386d792 100644
>>> --- a/Documentation/conf.py
>>> +++ b/Documentation/conf.py
>>> @@ -34,7 +34,7 @@ from load_config import loadConfig
>>> # Add any Sphinx extension module names here, as strings. They can be
>>> # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
>>> # ones.
>>> -extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain']
>>> +extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain', 'figure']
>>> 
>>> # The name of the math extension changed on Sphinx 1.4
>>> if major == 1 and minor > 3:
>>> diff --git a/Documentation/doc-guide/hello.dot b/Documentation/doc-guide/hello.dot
>>> new file mode 100644
>>> index 0000000..504621d
>>> --- /dev/null
>>> +++ b/Documentation/doc-guide/hello.dot
>>> @@ -0,0 +1,3 @@
>>> +graph G {
>>> +      Hello -- World
>>> +}
>>> diff --git a/Documentation/doc-guide/sphinx.rst b/Documentation/doc-guide/sphinx.rst
>>> index 96fe7ccb..6f8fdd1 100644
>>> --- a/Documentation/doc-guide/sphinx.rst
>>> +++ b/Documentation/doc-guide/sphinx.rst
>>> @@ -217,3 +217,21 @@ Rendered as:
>>>       * .. _`last row`:
>>> 
>>>         - column 3
>>> +
>>> +Figures
>>> +=======
>>> +
>>> +If you want to add an image on either Graphviz or SVG format, you should
>>> +use Sphinx ``figure``, where the name of the image file should end with
>>> +``.*``, like::
>>> +
>>> +  .. figure::  hello.*
>>> +     :alt:    hello world
>>> +
>>> +     DOT's hello world example
>>> +
>>> +
>>> +.. figure::  hello.*
>>> +   :alt:    hello world
>>> +
>>> +   DOT's hello world example
>>> diff --git a/Documentation/sphinx/figure.py b/Documentation/sphinx/figure.py
>>> new file mode 100644
>>> index 0000000..50d63bd
>>> --- /dev/null
>>> +++ b/Documentation/sphinx/figure.py
>>> @@ -0,0 +1,123 @@
>>> +# -*- coding: utf-8; mode: python -*-
>>> +# pylint: disable=W0141,C0113,C0103,C0325
>>> +u"""
>>> +    images
>>> +    ~~~~~~
>>> +
>>> +    Replacement for the sphinx ``figure`` and ``images`` directive.
>>> +
>>> +    :copyright:  Copyright (C) 2016  Markus Heiser
>>> +    :license:    GPL Version 2, June 1991 see Linux/COPYING for details.
>>> +
>>> +    List of customizations:
>>> +
>>> +    * generate PDF from SVG
>>> +
>>> +    * generate SVG / PDF from DOT files, see
>>> +      http://www.graphviz.org/content/dot-language
>>> +
>>> +    A image (or figure) directive which make use of the *glob* notation will be
>>> +    converted automatically with the tool chains listed below.:
>>> +
>>> +    * DOT to SVG: ``DOT(1)`` (http://www.graphviz.org) If graphviz is not
>>> +      available, the DOT language is inserted as literal-include.
>>> +
>>> +    * SVG to PDF:
>>> +
>>> +      * CairoSVG (http://cairosvg.org) if installed or alternatively
>>> +      * ``convert(1)``: ImageMagick (https://www.imagemagick.org)
>>> +"""
>>> +
>>> +import os
>>> +from glob import glob
>>> +import subprocess
>>> +
>>> +from docutils import nodes
>>> +from docutils.parsers.rst import directives
>>> +from docutils.parsers.rst.directives import images
>>> +
>>> +from sphinx.directives import patches
>>> +
>>> +try:
>>> +    import cairosvg
>>> +except ImportError:
>>> +    cairosvg = None
>>> +
>>> +def cmd_exists(cmd):
>>> +    exit_code = subprocess.call(
>>> +        "type " + cmd, shell = True,
>>> +        stdout = subprocess.PIPE, stderr = subprocess.PIPE )
>>> +    return bool(exit_code == 0)
>> 
>> So I would really like to find a way to get rid of all "shell=True"
>> invocations.  Here you're using a shell builtin, which makes it harder, I
>> guess.  Still, a proper solution is to search through the path for a
>> suitable executable file; there must be a Python library that will do
>> that?
>
> Should not be hard ... coming soon.

You should just try to run the command (perhaps with --version
argument), and see what happens. There is no need to explicitly search
the path at all. Likely you should be using Popen and
.communicate(). And you should pass the cmd and args as a sequence, not
as a string.

>> 
>>> +
>>> +convert_exists = cmd_exists('convert')
>>> +dot_exists = cmd_exists('dot')
>> 
>>> +def setup(app):  # pylint: disable=W0613
>>> +    directives.register_directive('figure', Figure)
>>> +    directives.register_directive('image', Image)
>>> +
>>> +def asterix_conversions(folder, node):
>> 
>> "Asterix" is an indomitable Gaul; "asterisk" is "*" :)  But again, I'm
>> not convinced we need this step at all.
>> 
>
> OMG :-)

:D

>
>
>>> +    if not node['uri'].endswith('.*'):
>>> +        return node
>>> +    name = folder + os.path.sep + os.path.splitext(node['uri'])[0]
>>> +
>>> +    fnames = glob(name + '.*')
>>> +    if (name + '.dot' in fnames):
>>> +        if not dot_exists:
>>> +            with open(name + '.dot', "r") as dot:
>>> +                data = dot.read()
>>> +                node = nodes.literal_block(data, data)
>>> +        else:
>>> +            dot2svg(name)
>>> +
>>> +    fnames = glob(name + '.*')
>> 
>> Assuming we keep this, why glob the same pattern twice?
>
> The "dot2svg(name)" in the line above might create a SVG-file
> and this should be converted to PDF also ... 

Again, I think we should use explictly the right filenames without
globbing.

>
>> 
>>> +    if (name + '.svg' in fnames):
>>> +        svg2pdf(name)
>>> +    return node
>>> +
>>> +
>>> +def dot2svg(fname):
>>> +    cmd = "dot -Tsvg %s.dot" % fname

cmd should be a sequence.

>>> +    with open(fname + '.svg', "w") as svg:
>>> +        exit_code = subprocess.call(
>>> +            cmd, shell = True,
>>> +            stdout = svg,
>>> +            stderr = subprocess.PIPE )
>>> +        svg.flush()
>>> +    return bool(exit_code == 0)
>> 
>> ...and this is just why I don't want shell=True.  If somebody slips in a
>> bogus figure directive, they just ran an arbitrary command here.  It
>> really shouldn't be necessary, especially if you've already located the
>> dot binary.
>
> Will be replaced ...
>
>>> +
>>> +def svg2pdf(fname):
>>> +
>>> +    if cairosvg:
>>> +        cairosvg.svg2pdf(url = fname + '.svg', write_to = fname + '.pdf')
>>> +        return True
>>> +
>>> +    if convert_exists:
>>> +        cmd = "convert %s.svg %s.pdf" % (fname, fname)

cmd should be a sequence.

>>> +        exit_code = subprocess.call(
>>> +            cmd, shell = True,
>>> +            stdout = subprocess.PIPE, stderr = subprocess.PIPE )
>>> +        return bool(exit_code == 0)
>>> +
>>> +    return False
>>> +
>>> +class Image(images.Image):
>>> +
>>> +    def run(self):
>>> +        result = images.Image.run(self)
>>> +        if len(result) == 2 or isinstance(result[0], nodes.system_message):
>>> +            return result
>>> +        folder = os.path.dirname(self.state.document.current_source)
>>> +        result[0] = asterix_conversions(folder, result[0])
>>> +        return result
>>> +
>>> +class Figure(patches.Figure):
>>> +
>>> +    def run(self):
>>> +        result = patches.Figure.run(self)
>>> +        if len(result) == 2 or isinstance(result[0], nodes.system_message):
>>> +            return result
>>> +        folder = os.path.dirname(self.state.document.current_source)
>>> +        result[0][0] = asterix_conversions(folder, result[0][0])
>>> +        return result
>>> +
>> 
>> Thanks for looking at this,
>
>
> Thanks for your thoughts and comments :-)

I didn't spot the extension notifying Sphinx about the build dependency
on the source files. The documentation won't be rebuilt if the source
files change.


BR,
Jani.


>
>
> --Markus--
>
>

-- 
Jani Nikula, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux