Re: Problems building the unifont PFA and DIT files for the PDF book

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-04-20 09:52, G. Branden Robinson wrote:
At 2024-04-20T14:26:17+0200, Alejandro Colomar wrote:
First problem:

In the Unifont, I don't see a "Regular" font.  I assumed I should take
the unifont.otf file.

Hi folks,

That's the BMP ~63.5k characters ~57k glyphs; unifont_upper are the SMP ~57.5k glyphs with specialized scripts and extended graphics like emojis: unlikely to be required for any LGC man pages.

		https://unifoundry.com/unifont/index.html

Since (I believe I saw you say that) you're using GNU Unifont only to
patch up missing code point coverage from other fonts, in your
application it probably makes sense to specify it as a "special" font.

afmtodit(1):
      The -s option should be given if the font is “special”, meaning
      that groff should search it whenever a glyph is not found in the
      current font.  In that case, font‐description‐file should be listed
      as an argument to the fonts directive in the output device’s DESC
      file; if it is not special, there is no need to do so, since
      troff(1) will automatically mount it when it is first used.
[...]
      -s     Add the special directive to the font description file.

I see that the foregoing advice is incomplete: updating the output
device's "DESC" file is only one approach; another is to add a `special`
request to the document, and that's the one I suggest you take for your
man pages book.

So you might put

.special Unifont

in your front.groff file or similar.

Here's how I've been groff-ifying the Tinos font:
	AFMTODIT	.tmp/fonts/devpdf/TinosR
	afmtodit -e /usr/share/groff/current/font/devpdf/enc/text.enc .tmp/fonts/devpdf/TinosR.afm /usr/share/groff/current/font/devpdf/map/text.map .tmp/fonts/devpdf/TinosR
	/usr/local/bin/afmtodit: AGL name 'mu' already mapped to groff name 'mc'; ignoring AGL name 'uni00B5'
	/usr/local/bin/afmtodit: AGL name 'periodcentered' already mapped to groff name 'pc'; ignoring AGL name 'uni00B7'
	/usr/local/bin/afmtodit: both gravecomb and uni0340 map to u0300 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both acutecomb and uni0341 map to u0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both uni0313 and uni0343 map to u0313 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both uni02B9 and uni0374 map to u02B9 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both alphatonos and uni1F71 map to u03B1_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both epsilontonos and uni1F73 map to u03B5_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both etatonos and uni1F75 map to u03B7_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both iotatonos and uni1F77 map to u03B9_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both omicrontonos and uni1F79 map to u03BF_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both omegatonos and uni1F7D map to u03C9_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Alphatonos and uni1FBB map to u0391_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Epsilontonos and uni1FC9 map to u0395_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Etatonos and uni1FCB map to u0397_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both iotadieresistonos and uni1FD3 map to u03B9_0308_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Iotatonos and uni1FDB map to u0399_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Upsilontonos and uni1FEB map to u03A5_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both dieresistonos and uni1FEE map to u00A8_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Omicrontonos and uni1FF9 map to u039F_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Omegatonos and uni1FFB map to u03A9_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both uni2000 and uni2002 map to u2002 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both uni2001 and uni2003 map to u2003 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both Ohm and uni2126 map to u03A9 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both uni1FE3 and upsilondieresistonos map to u03C5_0308_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both uni1F7B and upsilontonos map to u03C5_0301 at /usr/local/bin/afmtodit line 6586.
	/usr/local/bin/afmtodit: both patah and yodyod_patah map to u05B7 at /usr/local/bin/afmtodit line 6586.

Are any of those warnings something I should take care of?  Or should
I just ignore them?  If they're unimportant, can I ask that low
severity warnings not be printed?  Or should I just 2>/dev/null?

The afmtodit(1) man page, and groff's "PROBLEMS" file (in the source
distribution, since these warnings can arise when building groff)
address this point.  Whether it's a problem depends on what you wanted.

afmtodit(1):

Diagnostics
      AGL name 'x' already mapped to groff name 'y'; ignoring AGL name
      'uniXXXX'
             You can disregard these if they’re in the form shown, where
             the ignored AGL name contains four hexadecimal digits XXXX.
             The Adobe Glyph List (AGL) has its own names for glyphs;
             they are often different from groff’s special character
             names.  afmtodit is constructing a mapping from groff
             special character names to AGL names; this can be a one‐to‐
             one or many‐to‐one mapping, but one‐to‐many will not work,
             so afmtodit discards the excess mappings.  For example, if x
             is Delta, y is *D, and XXXX is 0394, afmtodit is telling you
             that the groff font description that it is writing cannot
             map the groff special character \[*D] to AGL glyphs Delta
             and uni0394 at the same time.

             If you get a message like this but are unhappy with which
             mapping is ignored, a remedy is to craft an alternative map‐
             file and re‐run afmtodit using it.

Well, apart from those warnings, that works.  Now, I repeat the process
with the Unifont:
[...]
	$ make build-pdf-book
	GROPDF		.tmp/man-pages-6.7-70-gd80376b08.pdf
	troff:.tmp/fonts/devpdf/UnifontR: error: font description 'spacewidth' directive missing
[...]
Did I do anything wrong with the Unifont?  I suspect of treating it as a
Regular font without any indication that I should.

No, you simply need to tell groff how wide a space should be in that
font.  In groff, a space is not a kind of glyph, because it doesn't put
any "ink" on the "page"; instead it's a (discardable) horizontal
motion.[1]  "Discardable" because if it occurs at the end of an output
line, it is discarded.

If the formatter didn't discard spaces
at the ends of output lines, that would
defeat adjustment to both  margins,  as
one  can  observe in this example here.
Note the ragged margin ending the first
line.

afmtodit(1):
      -w space‐width
             Use space‐width as the width of inter‐word spaces.

You will probably want to know what number to use for a font's space
width.  This is a judgment typographers make.  The groff Texinfo manual
and groff_diff(7) page share a rule of thumb.

      .ss word‐space‐size [additional‐sentence‐space‐size]
             A second argument sets the amount of additional space
             separating sentences on the same output line.  If omitted,
             this amount is set to word‐space‐size.  Both arguments are
             in twelfths of current font’s space width (typically one‐
             fourth to one‐third em for Western scripts; see
             groff_font(5)).  The default for both parameters is 12.
             Negative values are erroneous.

My approach is to generate the font description file _without_
the `-w` option, then read the resulting to file to see how wide the
glyphs are.

If I do this for the URW Times roman font:

$ grep '^M' build/font/devpdf/TR
M       889,662 2       77      M       --      004D

I can see that the "M" is 889 basic units wide (see groff_font(5) for an
explanation of this file format and its terminology).

One third of 889 (rounded to an integer) is 296, so, personally, I'd say
"-w 296".  But in principle, any value between 223 and 296 is "sound",
and ultimately, the "correct" value is whatever best pleases you as a
typographer when considering your document.  It's also worth noting that
when adjustment is enabled, as is the case in AT&T and GNU troffs by
default, an inter-word space will seldom be _exactly_ this "spacewidth"
in any case, except where the document (or a macro package) has
explicitly disabled adjustment.

OpenType fonts are normally designed with an 1000 units/em, and Truetype may be 1024 or 2048 units/em, so should use 333 or maybe 300 if you prefer a tighter look, close to your suggestion.

$ ttfdump /usr/share/fonts/urw-base35/NimbusRoman-Regular.otf | awk "/'head'/,/^$/"
   6. 'head' - checksum = 0x0cdb53f2, offset = 0x00016f4c, len =        54
   7. 'hhea' - checksum = 0x06b6057b, offset = 0x00016f84, len =        36
   8. 'hmtx' - checksum = 0x35d9ae6c, offset = 0x00016fa8, len =      3420
   9. 'maxp' - checksum = 0x03575000, offset = 0x00017d04, len =         6
  10. 'name' - checksum = 0x8993f63c, offset = 0x00017d0c, len =       620
  11. 'post' - checksum = 0xffb10032, offset = 0x00017f78, len =        32

'head' Table - Font Header
--------------------------
         'head' version:         1.0
         fontReversion:          1.0
         checkSumAdjustment:     0x69d6e98e
         magicNumber:            0x5f0f3cf5
         flags:                  0x0003
         unitsPerEm:             1000
         created:                0x00000000d5420807
         modified:               0x00000000d5420807
         xMin:                   -168
         yMin:                   -281
         xMax:                   1000
         yMax:                   1053
         macStyle bits:          0x0000
         lowestRecPPEM:          3
         fontDirectionHint:      2
         indexToLocFormat:       0
         glyphDataFormat:        0

For comparison Tinos ttf substitute for Times Roman:

$ ttfdump /usr/share/fonts/tinos/Tinos-Regular.ttf | awk "/'head'/,/^$/"
  12. 'head' - checksum = 0x0bd978fc, offset = 0x0000015c, len =        54
  13. 'hhea' - checksum = 0x19811ca6, offset = 0x00000194, len =        36
  14. 'hmtx' - checksum = 0xa4bce0e7, offset = 0x00000238, len =     13116
  15. 'kern' - checksum = 0xa39da9f5, offset = 0x0008d6f8, len =      5220
  16. 'loca' - checksum = 0x28e2bf88, offset = 0x0001a45c, len =     13120
  17. 'maxp' - checksum = 0x10d405bc, offset = 0x000001b8, len =        32
  18. 'name' - checksum = 0xc3ff0ad5, offset = 0x0008eb5c, len =      2052
  19. 'post' - checksum = 0xe841b7c5, offset = 0x0008f360, len =     34664
  20. 'prep' - checksum = 0xbd48485c, offset = 0x00019b40, len =      1550

'head' Table - Font Header
--------------------------
         'head' version:         1.0
         fontReversion:          1.20736
         checkSumAdjustment:     0x84b246c2
         magicNumber:            0x5f0f3cf5
         flags:                  0x001b
         unitsPerEm:             2048
         created:                0x00000000c844d0ce
         modified:               0x00000000d25f0c4c
         xMin:                   -1114
         yMin:                   -797
         xMax:                   5728
         yMax:                   2068
         macStyle bits:          0x0000
         lowestRecPPEM:          9
         fontDirectionHint:      2
         indexToLocFormat:       1
         glyphDataFormat:        0

[1] I do observe that the URW font descriptions shipped by groff include
     a special character called "space".  Syntactically, this would be
     accessed within a groff document via a special character escape
     sequence: `\[space]`.  I've never seen a document do this.  I admit
     that I don't have any idea why this is present or what its semantics
     are: I need a PostScript or PDF expert to tell me.[2]  It does occur
     to me that we might enhance afmtodit make of use of it as the
     default "spacewidth".

[2] Or I can self-help; I have copies of the _PostScript Language
     Reference Manual_ (3rd ed.) and a version of ISO 32000 lying around.

But Unifont uses 64 units/em, so 20-21?

$ ttfdump /usr/share/fonts/opentype/unifont/unifont.otf | awk '/head/,/^$/'
   5. 'head' - checksum = 0x5f163d75, offset = 0x000000bc, len =        54
   6. 'hhea' - checksum = 0x003adf37, offset = 0x000000f4, len =        36
   7. 'hmtx' - checksum = 0x3eb11f30, offset = 0x004a6b34, len =    228344
   8. 'maxp' - checksum = 0xdefe5000, offset = 0x00000118, len =         6
   9. 'name' - checksum = 0x5aec7895, offset = 0x00000184, len =      1000
  10. 'post' - checksum = 0x00030002, offset = 0x00000604, len =        32

'head' Table - Font Header
--------------------------
         'head' version:         1.0
         fontReversion:          0.0
         checkSumAdjustment:     0x3e8fcc29
         magicNumber:            0x5f0f3cf5
         flags:                  0x0003
         unitsPerEm:             64
         created:                0x0000000000000000
         modified:               0x0000000000000000
         xMin:                   -64
         yMin:                   -8
         xMax:                   64
         yMax:                   56
         macStyle bits:          0x0000
         lowestRecPPEM:          16
         fontDirectionHint:      2
         indexToLocFormat:       0
         glyphDataFormat:        0

--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry




[Index of Archives]     [Kernel Documentation]     [Netdev]     [Linux Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux