contents
  1. other programs you need
  2. sources for other programs
  3. contents of fonty-rg
  4. preparations
  5. creating your own font definition file
    1. how a definition line looks like
    2. how many definitions are allowed
    3. how hexadecimal counting works
    4. what ranges we might meet
    5. rules for some HEX ranges
    6. contents of the definition file
    7. how font positions and limits are created
    8. summary
    9. practical hints
  6. (re)building fonts
  7. creating your userdefined acm kernel map
    1. how an acm map looks like
    2. 256 positions, simple definitions
    3. 512 positions, merged definitions
  8. using your font on console
    1. general procedure to use a font
    2. procedure for official combinations
    3. procedure for own mixtures
    4. switching between font contents
  9. names for files dealing with fonts
  10. how the whole chain works
  11. how your system looks like after booting
  12. error and other messages
    1. 9th column displayed incorrectly

other programs you need
the first three are usually already present on your system
  1. a shell working as /bin/sh
  2. kbd or console-tools/console-data
  3. perl-5.001 or later (expected as /usr/bin/perl)
where to get the sources for other programs
if you are running a distribution, search at their ftp server
if you are compiling from source here the home
contents of fonty-rg
  1. top directory
    build.sh (shell)
      wrapper for building the fonts
    choose (perl)
      select needed pictures for a font
    compact (perl)
      putting same pictures in the font together
    vga (perl)
      sorting characters according to pixels in the 8th column
    *.psf.gz
      precompiled compressed, ready-to-use console fonts
  2. subdirectory /charsets
    cz2t.sh (shell)
      remove columns from the given file
    LatCyrGr (contains Latin, Cyrillic, Greek)
      table hex to Unicode
    *.txt (contains what the filename says)
      =xx U+xxxx OFFICIAL NAME: table hex to Unicode
    chavo.chars (contains special mixture East Europe, Esperanto)
      =xx U+xxxx OFFICIAL NAME: table hex to Unicode
    graphics
      U+xxxx: Unicode values for creating a box
  3. subdirectory /source
    *.sbf
      glyphs: pictures how each Unicode value looks
      contains the range of values the filename says

preparations
  1. unpack the fonty-rg package into your desired directory
    you will get a directory called fonty-rg
  2. unpack the fonty-1.0 package

creating your own font definition file
font definition files are simple text files in /charsets
 
how a definition line looks like
 
=HEX  | U+number | # OFFICIAL NAME
------+----------+--------------------
=20   | U+0020   | # SPACE
=21   | U+0021   | # EXCLAMATION MARK
  1. =
    is only a mark helping to catch this number in scripts
  2. 20 and 21
    are the hexadecimal values as `man ascii` or `man iso8859_x` show
  3. U+
    the signal that this plus the following number is a Unicode value
  4. 0020 and 0021
    the Unicode numbers; together with U+ they form the Unicode value associated with a picture (glyph)
  5. #
    comment sign for the following description
  6. SPACE and EXCLAMATION MARK
    are the official names of those characters
    Such comments help when you deal with characters of a foreign language or with seldom used ones whose meaning you do not know by heart.
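
The parts of a definition line described above can be picked out with a small filter. This is only a sketch: parse_defs is a made-up helper name and not one of the fonty-rg scripts, and it assumes the "=xx U+xxxx" layout shown in the example lines (the "|" column separators of the table are tolerated as well).

```shell
# Hypothetical helper (not part of fonty-rg): read definition lines on stdin
# and print "HEX UNICODE" for each one; lines not starting with "=" are skipped.
parse_defs() {
    sed -n 's/^=\([0-9A-Fa-f][0-9A-Fa-f]\)[[:space:]|]*\(U+[0-9A-Fa-f]\{4\}\).*/\1 \2/p'
}

parse_defs <<'EOF'
# sample lines in the layout shown above
=20 U+0020 # SPACE
=21 U+0021 # EXCLAMATION MARK
EOF
```

The "=" mark is what makes the pattern simple: a line counts as a definition only if it starts with it.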
how many definitions are allowed
Every line in such a text file that does not begin with a # comment sign is a definition line, and you can have at most 256 or at most 512 different definitions. In the end a definition always results in a picture, the glyph, so the number of lines in your text file is only a rough estimate of whether you are within the limit or already exceeding it. If you want to define, for example, a font which contains iso8859-1 and also Cyrillic letters, somewhere in your text file you will have these 2 lines:
  1 line for latin capital letter A (which is U+0041)
  1 line for cyrillic capital letter A (which is U+0410)
 
There are separate pictures for both of them, but when you see them you will realize that both pictures are the same, and therefore your 2 lines only count as 1 definition. Later, when you call the build.sh script to build your font, the compact script takes care that all lines which have the same picture are put into one definition; in our example it will do something like this
  instead of [U+0041] picture and [U+0410] picture
  do [U+0041] [U+0410] picture (both are capital A)
 
Usually you will not need more than 256 definitions. But if you have put a lot into the file and suddenly realize that you are beyond 256, don't worry: the build will then make a 512 table. Beyond 512, however, this kind of table ends.
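
To get a feeling for where a definition file stands against these limits, the lines can simply be counted. The following is a rough sketch only; estimate_table is a made-up name, and the number it prints is an upper bound, because lines which share a glyph are merged later by the compact script.

```shell
# Hypothetical helper (not part of fonty-rg): estimate the table size needed
# for a definition file. Counts every non-empty line that does not start
# with '#'; merged glyphs can only make the real number smaller.
estimate_table() {
    n=$(grep -v -c -e '^#' -e '^[[:space:]]*$' "$1")
    if [ "$n" -le 256 ]; then
        echo "$n lines: fits a 256-position font"
    elif [ "$n" -le 512 ]; then
        echo "$n lines: may need a 512-position font"
    else
        echo "$n lines: may exceed 512 unless enough glyphs are merged"
    fi
}

# usage, assuming you are in the fonty-rg directory:
#   estimate_table charsets/chavo.chars
```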
 
how hexadecimal counting works
Now let's have a short look at the hexadecimal values (the HEX in our examples above). They are counted like this:
 
00   ___10   ___20              from 00 to 09 and 0A to 0F (end of 0)
01  |   11  |   21              from 10 to 19 and 1A to 1F (end of 1)
..  |   ..  |   ..              from 20 to 29 and 2A to 2F (end of 2)
09  |   19  |   29
0A  |   1A  |   2A              you can best remember if you just speak
..  |   ..  |   ..              the numbers separately, like this:
0F  |   1F  |   2F              one E, one F, two zero, two one ....
|___|   |___|   |__ and so on
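
If you want to check the counting yourself, the shell can print it. A small sketch, where hex2 is just a made-up name: it prints each decimal number you give it as a two-digit HEX value.

```shell
# Hypothetical helper: print decimal numbers as two-digit hexadecimal values.
# printf repeats its format for every argument, so each value gets its own line.
hex2() {
    printf '%02X\n' "$@"
}

# the step from 09 to 0A, and from the end of row 1 to the start of row 2
hex2 9 10 15 16 30 31 32 33
```

The eight lines printed are 09, 0A, 0F, 10, 1E, 1F, 20 and 21.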

what ranges we might meet
Counting on in this way, we can split the long row into several parts which have special meanings or purposes, and by doing this we get ranges. (Ranges just make it a bit easier to speak about a number of values which are treated equally.) Well-known ranges are these:
 
00  # NUL             --+ control characters (essential terminal
..    ..                | controlling characters like bell, carriage
1F  # UNIT SEPARATOR  --+ return, end of transmission ...)
20  # SPACE           --+
..    ..                | ascii characters
7F  # DELETE          --+
80  # blink           --+ control characters (additional terminal
..    ..                | controlling characters like bold, underline,
9F  #                 --+ reverse ...)
A0  # NO-BREAK SPACE  --+
..    ..                | iso8859 characters
FF  # subset dependent--+

rules for some HEX ranges
For the ascii range and parts of the iso8859 range the Unicode number is equal to the HEX number with two zeros put in front of it, so your lines would look like this:
 
=20  U+0020 -+
=21  U+0021  |
  ...        | the ascii range is the same everywhere
=7E  U+007E  |
=7F  U+007F -+
=A0  U+00A0 -+
=A1  U+00A1  |
  ... no     | in the iso range it does not always continue like this;
             | here the Unicode number depends on what the subset says
             | it should look like. For example
=A4  U+00A4  # CURRENCY  in iso8859-1
=A4  U+20AC  # EURO SIGN in iso8859-15
               Unicode values for a subset can for example be
               looked up in the other .txt files in /charsets
=00  U+0000 -+ the range of essential controls
 ...         | 
=1E  U+001E  | 
=1F  U+001F -+ 

=80  U+     -+ the range of additional controls
 ...         |
=9F  U+     -+

contents of the definition file
Your font file can hold whatever pictures (glyphs) you like; but please remember that other people mostly use official character maps like iso8859-1 or iso8859-15. If you mix the characters of different sets, those people will see only strange rubbish. You can, for example, replace only the currency sign by the euro sign in iso8859-1 and make your myfont.psf from that. If you now create a text containing "1/4 euro" and send it to other people, those who use the -1 subset will see "1/4 currency" and those with the -15 subset will see "capital OE ligature euro". These are called conflicting characters, because the same HEX value has a different Unicode value (and therefore a different picture) in another iso8859 subset.
But you can of course have both full character sets in your font, with lines for all conflicting characters. In our example this means that you first write all definitions for the -1 subset and then write all definitions for the conflicting characters of the -15 subset. (In that case you would already see when creating your text that the combination "1/4 euro" is not possible, and you would end up with "0.25 cent".)
 
how font positions and limits are created
Now uncompress the chavo.psf.gz font in the top directory of fonty-rg; this is a nearly completely filled up font. Then extract the builtin Unicode character table with this command:
  psfgettable chavo.psf chavo.builtin.table
 
Open chavo.builtin.table in your editor. Note that it has 259 lines; the first 3 lines are comments, so they do not count: 259-3=256 lines, and this is the first limit. Now scroll down with your eyes on the first column and you will see that it follows the counting you already learned, but here it has 0x0 in front.
We need the 0x0 in front because, if this were a simple hexadecimal notation with only 0x in front, there would be a definite end at 0xFF (F is the last digit we have) and we would never be able to make a font with more than 256 definitions. But the next limit is 512, and a font going beyond 256 definitions continues like this: 0x0FF and then 0x100, 0x101, 0x102 .. So the first 256 have 0 as first digit and the second 256 have 1 as first digit (the counting scheme is the same as we learned, it just has one more digit in front).
 
This first column is also called the font position, and it is needed to find the picture (glyph). If your program tells the screen driver to spit out the word "error", you expect that the screen driver does not need ages searching "where the heck in this file might be the o, ah here, no this looks more like a zero, just hold on ..."
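
The step from the first 256 positions to the second 256 can be seen by printing a few position numbers with three digits. Again only a sketch; font_pos is a made-up name.

```shell
# Hypothetical helper: print decimal position numbers in the three-digit
# 0x... notation used in the first column of the table.
font_pos() {
    printf '0x%03X\n' "$@"
}

# last position of a 256 font, first and last position of the second half
font_pos 255 256 511
```

This prints 0x0FF, 0x100 and 0x1FF: the third digit is all that is needed to reach the 512 limit.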
 
summary
With the lines which you write in your font definition text, you tell which Unicode value should be associated with the HEX value of that character, and the comment behind it gives its official name (and if you know the character you also know how the glyph looks). Unicode values which turn out to have glyphs looking the same will later be put together, making a single definition from your two definition lines. This single definition occupies only a single font position. You can have either a font with at most 256 font positions or a font with at most 512 font positions.
practical hints
  1. take an already existing definition file as starting point
  2. copy that one to a file with your desired name
  3. at the bottom of your new definition file add additional lines
    you might consider inserting a # comment line with description
    this way it is easier to take it as template for new ones
  4. all iso8859 text files can be easily examined with diff
  5. definition lines which are identical need not be repeated
    you need for example the ascii range only once
    some of the general ones from iso range are also identical

(re)building the fonts
  1. change into the fonty-rg/ directory and execute
       ./build.sh
    it might take a while until you see some messages
    you will get the newly built fonts in this directory
    they are named <purpose>.psf.gz
    they overwrite the ones coming with this package

creating your userdefined acm kernel map
If your font simply mixes all characters which you find nice looking and you want to display them beside each other, you cannot use an acm kernel map which only contains ranges of official characters. So you need to make your own acm map.
 
There are basically two things the kernel can do with the bytes it receives from a program when it looks up values in the acm map.
  1. it simply displays what the font has for that value
    this is called direct-to-font
    and it looks like: (for) 0x8F (display) 0x8F
    an example is ..consoletrans/trivial
  2. it displays the glyph of the Unicode value for that value
    this is called user-to-unicode, where user means program
    and it looks like: (for) 0x8f (display the) U+008f (glyph)
    examples are ..consoletrans/8859-x_to_uni.trans
We would not need all the Unicode numbers if we simply want the kernel to display direct-to-font, so this case is less interesting for us.
 
how an acm map looks like
As a general rule for our acm map we can use lines like these, which are the most flexible way of writing (as they allow comments):
0x000   U+fffd  -+     this is the default for "unknown character"
 ...             | for 256 positions
0x0ff   U+      -+
0x100   U+      -+
 ...             | for 512 positions
0x1ff   U+      -+
^----------------- internal value which a program sends to be displayed
  ^--------------- 3 digits for 512 positions, which do no harm for 256
        ^--------- Unicode value for the glyph which is printed
256 positions, simple definitions
for an acm map which deals with 256 positions and only has definition lines which are not merged into one definition, you can do this
  1. take the text file you created in the /charsets directory
  2. replace all "=" signs with "0x0"
  3. delete the official name explanation
  4. fill up the missing counting values in the first column
    these might be the essential and additional control ranges
  5. give the first value the Unicode value for unknown character
    like this: 0x000 U+FFFD
    you always see this if a requested character is not in the font
  6. you can specify an alternate if the font does not have the glyph
    in the line in question just add a second Unicode value
    like this: 0x0a4 U+20AC U+00A4
    means if you don't find the euro sign display the currency sign
  7. HEX values which you did not define must stand alone
    like this: 0x003
    means no picture if the program sends these bytes
    this is necessary to keep the order according to your font
  8. save it under a name related to your font like <your_font>.acm
    the .acm will reflect that this is an acm kernel mapping
    for kbd this will finally go to unimaps/
    for console-tools it finally goes to consoletrans/
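
Steps 2 to 5 and 7 above can be done by a small filter instead of by hand. This is a sketch under assumptions: defs_to_acm is a made-up name, the input is expected in the "=xx U+xxxx # NAME" layout with whitespace between the columns, and alternates (step 6) are not handled. If the same HEX value is defined twice, the last line wins.

```shell
# Hypothetical helper (not part of fonty-rg): read definition lines on stdin
# and print a 256-line acm skeleton on stdout.
defs_to_acm() {
    awk '
        /^=/ {
            pos = tolower(substr($1, 2))   # "=A4" becomes "a4"
            uni[pos] = $2                  # the U+xxxx column
        }
        END {
            for (i = 0; i < 256; i++) {
                pos = sprintf("%02x", i)
                if (i == 0)          print "0x000 U+FFFD"        # unknown character
                else if (pos in uni) print "0x0" pos " " uni[pos]
                else                 print "0x0" pos             # undefined, stands alone
            }
        }
    '
}

# usage, assuming a definition file of your own in charsets/:
#   defs_to_acm < charsets/<your_font>.txt > <your_font>.acm
```

Position 0x000 is always written as U+FFFD here, whatever the definition file says for =00; change that line if your font uses position 0 for something else.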
512 positions, merged definitions
If you have up to 512 definitions which are all pointing to a different picture/glyph, you just continue according to what you did with the first 256 positions. So: go down the whole HEX counting, leave the Unicode value empty if you did not specify a glyph, and maybe add an alternate Unicode value to display if the glyph is not in the font.
 
If you have definition lines which will be merged into one line, this means that you want to alternately display two different character sets. And this again means you must change the acm kernel map first.
using your font on console
for the first test you can keep the font in the fonty-rg directory
 
general procedure to use a font
This assumes you did not put the screen driver in utf8 mode
To get your own font working you might need the following components
  1. your font of course
    - with kbd: setfont <your_font>.psf.gz
    - with console-tools: consolechars <your_font>.psf.gz
  2. a screen driver translation map hex to Unicode to find the glyph
    only needed if the font does not have a builtin Unicode map
    - with kbd: setfont -u <fitting>.trans
    - with console-tools: consolechars -u <fitting>.sfm
  3. a userdefined acm kernel map for the (sub)set you want to use
    - with kbd: setfont -m <desired_set>.uni
    - with console-tools: consolechars -m <desired_set>.acm
  4. the command to switch to the userdefined kernel acm map
    general for defining G0 to keep it: echo -e "\033(K"
    general for defining G1 to keep it: echo -e "\033)K"
    - kbd: not needed for G0 (included in "-m")
    - console-tools: not needed for G0, G1 defining with --g1

procedure for official combinations
if your font contains a combination of official character sets
like chavo combines the officials latin1,2,3 and koi8-r
  1. - load your font: consolechars or setfont <your_font>.psf.gz
  2. fonty-rg fonts have a hex to Unicode screen driver map builtin
    means you normally do not need an external screen driver map
    with kbd: -u ..consoletrans/<fitting>.trans
    with console-tools: -u ..consoletrans/<fitting>.sfm
  3. - load the momentarily desired userdefined acm kernel map
    - kbd: setfont -m ..unimaps/<desired>.uni
    - console-tools: consolechars -m ..consoletrans/<desired>.acm
    for example to get the koi8-r from the chavo font: koi8-r.uni|.acm
    Notes:
    Some kbd .uni maps like the iso0x.uni do not work; to get this to work correctly you can download console-data and copy the .acm files from /consoletrans to the share/kbd/unimaps/ directory.
    The userdefined map is shared by all ttyX terminals of the system, which means you can't activate another userdefined map at the same time.
  4. kbd and console-tools include the G0 defining when using "-m .."
    so G0 defining in an extra step is not needed
    console-tools have an option "--g1" to define G1 instead of G0

procedure for own mixtures
if your font is a mixture not corresponding to official sets
for example iso8859-1 with only the currency sign exchanged for the euro sign
  1. - load your font: consolechars or setfont <your_font>.psf.gz
  2. fonty-rg fonts have a hex to Unicode screen driver map builtin
    means you normally do not need an external screen driver map
    with kbd: -u ..consoletrans/<fitting>.trans
    with console-tools: -u ..consoletrans/<fitting>.sfm
  3. then you need to load your special extracted acm kernel map
    - kbd: setfont -m <your_font>.uni
    - console-tools: consolechars -m <your_font>.acm
  4. kbd and console-tools include the G0 defining when using "-m .."
    so G0 defining in an extra step is not needed

switching between font contents
You might have a font which contains characters for several different character sets; the chavo font, for example, is able to display 4 character sets. With the command for your userdefined acm kernel map you already said which of those character sets should be used in the beginning. Now we see how to get our font to display another character set.
 
In all we have 4 acm kernel maps which we can switch to
and 2 (kind of) variables which hold the defined acm kernel map.
Those 2 variables are G0 and G1 and their initial value is predefined, but that is just for convenience and we can always redefine them.
 
  1. predefined value for G0: latin1
    defining G0 variable to hold a special acm map
    G0 to ISO latin1  with `echo -en "\033(B"`
    G0 to IBM PC 437  with `echo -en "\033(U"`
    G0 to DEC VT100   with `echo -en "\033(0"`
    G0 to userdefined with `echo -en "\033(K"`
  2. predefined value for G1: DEC VT100
    defining G1 variable to hold a special acm map
    G1 to ISO latin1  with `echo -en "\033)B"`
    G1 to IBM PC 437  with `echo -en "\033)U"`
    G1 to DEC VT100   with `echo -en "\033)0"`
    G1 to userdefined with `echo -en "\033)K"`

A terminal always starts with the predefined G0 value; if we don't like that value we define the variable to hold another acm map. Once our G0 and G1 values are ok, we can switch between them with
 
  1. switch to G0 with key press: CTRL+O (ctrl and capital O together)
  2. switch to G1 with key press: CTRL+N (ctrl and capital N together)
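
The sequences above can also be sent from a script instead of typing them. The sketch below wraps them in shell functions; the function names are made up. It uses printf instead of echo -en, and it relies on CTRL+O and CTRL+N being the bytes 0x0F and 0x0E. It only has the described effect on a Linux text console which is not in utf8 mode.

```shell
# Hypothetical helpers wrapping the sequences listed above.
define_g0_user() { printf '\033(K'; }   # G0 holds the userdefined acm map
define_g1_user() { printf '\033)K'; }   # G1 holds the userdefined acm map
use_g0()         { printf '\017'; }     # same byte as CTRL+O (0x0F)
use_g1()         { printf '\016'; }     # same byte as CTRL+N (0x0E)

# usage on the console, after loading a map with "-m":
#   define_g1_user   # keep latin1 in G0, put your own map into G1
#   use_g1           # now display with your own map
#   use_g0           # and back to latin1
```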


final system adjustments
if your tests succeeded and you want to keep your versions
  1. to separate them from your system's fonts you might use /usr/local
  2. make directories corresponding to your system's shared ones
  3. copy your new font to the consolefonts/ directory
    you might consider using the .psf and .psfu naming scheme
    it is easier to see from the font's name whether a table is builtin
    if you stripped out some builtin hex to Unicode tables
  4. copy your new translation maps to consoletrans/ and/or unimaps/
    you might consider using the .acm and .sfm naming scheme
    it is easier to see from the map's name when it is used

if you decide to work with fonty-rg more often
  1. copy fonty-rg's scripts to bin/
  2. if not yet present add this directory to your PATH variable
  3. in share/ make a directory called fonty-rg
  4. copy fonty-rg's directory /source to share/fonty-rg/
  5. if you like do the same for the /charsets directory

names for files dealing with fonts
how the whole chain works

kernel = lot of other things + console driver
console driver = keyboard driver + screen driver
 
  1. fingers press keys
  2. keyboard hardware sends scancodes to kernel
    [programs for scancodes: getkeycodes and setkeycodes]

  3.  
  4. kernel looks up keycode for the scancode in a table
  5. kernel sends keycode to keyboard driver
  6. keyboard driver looks up character for keycode in keytable
    [programs for keytables: dumpkeys and loadkeys]
    [programs for special keys: setmetamode]

  7.  
  8. keyboard driver is in one of 4 modes
    to send the characters to programs
    1. raw mode sends scancode
       for programs with their own keyboard driver (X11)
    2. keycode mode sends keycode
       for unknown purpose
    3. ascii mode sends character as 8-bit encoding
       (only 256 available)
    4. utf8 mode sends character as prefixed 8-bit encoding
       which makes multi-bytes
    [programs for keyboard (driver) mode: kbd_mode and showkey]

  9.  
  10. keyboard driver sends characters to program
  11. program works until result
  12. program sends characters to display to screen driver

  13.  
  14. screen driver is in one of 2 modes
    to receive characters from programs
    1. utf mode interprets received bytes as utf8 sequence
       converts it into (UCS-2) 16-bit sequences
       looks up the glyph to display in the sfm screen font map
    2. byte mode interprets received bytes as byte sequences
       looks up the bytes in the acm application charset map
       converts it into utf8 sequences, then into 16-bit sequences
       then looks up the glyph in the sfm screen font map
    [program for screen driver mode: vt-is-UTF8 (not in kbd package)]
    [switch to utf mode  with `echo -en "\033%G"`]
    [switch to utf mode  with `echo "\x1b%G"`]
    [switch to byte mode with `echo -en "\033%@"`]
    [switch to byte mode with `echo "\x1b%@"`]
    [programs for Unicode tables in fonts: psf{get,add,strip}table]
     
    acm application charset map is one of 4 maps,
    3 of them built into kernel (also called console maps)
    1. default IBM codepage 437 character set
       for i386 other architectures (also called PC code)
    2. DEC VT100 character set
    3. ISO latin1 character set
    4. user definable which is at boot time straight-to-font
     
    U+FFFD mostly font position 0 is the replacement character
    displayed if a character is not found in the sfm screen font map
    control ranges with Unicode values from U+F000 to U+F1FF
    (straight-to-font range) directly display what the font has
    [acm maps are in /usr/src/linux/drivers/char/console.c]
    [program for acm application charset map: none]
     
    a terminal can switch between two modes with G0 and G1
    1. G0 is by default ISO latin1
    2. G1 is by default DEC VT100
    [switch to G0 with ]
    [switch to G1 with ]
       example: on tty1=cp437 builtin with G0 switch
       example: on tty1=vt100 builtin with G1 switch
       example: on tty2=iso01 builtin with G0 switch
       example: on tty2=iso02 user with G1 switch
    but on all tty's there can only be one user-defined at a time
       example: on tty3=myown user with G0 impossible
    [adjust G0 to ISO latin1  with `echo -en '\033(B'`]
    [adjust G0 to IBM PC 437  with `echo -en '\033(U'`]
    [adjust G0 to DEC VT100   with `echo -en '\033(0'`]
    [adjust G0 to userdefined with `echo -en '\033(K'`]
    [adjust G1 to ISO latin1  with `echo -en '\033)B'`]
    [adjust G1 to IBM PC 437  with `echo -en '\033)U'`]
    [adjust G1 to DEC VT100   with `echo -en '\033)0'`]
    [adjust G1 to userdefined with `echo -en '\033)K'`]
     
  15. screen driver interprets received character according to the mode it is in
  16. screen driver converts character to UCS-2 16-bit
  17. screen driver looks up glyph for the character in font map
  18. screen driver prints the glyph to the screen

how your system looks like after booting
If you start your linux system and do not run special initscripts which change settings, your system will have this status:
 
  1. your keymap is the US keymap (qwerty/defkeymap)
    an initscript uses loadkeys to change the map for your keyboard
  2. the keyboard driver is in the default ASCII mode
    normally no initscript uses kbd_mode to change to utf8 mode
  3. so all your programs receive 8-bit ascii characters
  4. the screen driver is in the default byte mode
    normally no initscript uses the %G echo to change to utf8 mode
  5. so all characters from programs will be treated as byte sequences
  6. the acm application character map is the default G0 ISO latin1
    normally no initscript changes this
  7. so the cp437.uni map will be used as base for transforming
  8. the console font is the default8x16.psfu.gz in /consolefonts/
    maybe an initscript uses setfont | consolechars to change the font


shared directories and what they contain
all those are usually in /usr/share/ (formerly in /usr/lib/)
subdirectories are up to distribution/install options
  1. consolefonts/
    used by console-tools and kbd
    with command "consolechars ..." or "setfont ..."
    contains compiled fonts in the linux default psf format
  2. consoletrans/
    used by console-tools and kbd
    contains mapping tables
    1. console-tools
      with command "consolechars -m ..."
      contains acm kernel mapping tables as *.acm or *.trans
      with command "consolechars -u ..."
      contains font mapping tables as *.sfm
    2. kbd
      with the command "setfont -m ..."
      contains acm kernel mapping tables as *.trans or without
  3. unimaps/
    used by kbd
    with command "setfont -u ..." or "loadunimap"
    contains non-builtin font mapping tables as *.uni or without

error and other messages
These are some messages which you might see and what to make of them.
 
9th column displayed incorrectly
- occurs during building a font
- message sent by vga script
 
Open one of the .sbf files containing the glyphs and look at the lines which show a picture of a character. You will count 16 lines for the height and 9 dots for the width. We deal with a font here which is described as 8x16 font, with 8 referring to the width and 16 to the height. So the 8th and 9th column of the width are the ones of interest here.
 
The video memory has space for 8 pixels (8 pixels = 1 byte). The normal VGA hardware has space for 9 pixels, and it is responsible for how things look on the screen when all those pictures are finally put next to each other. You would, for example, expect letters to be separated by a little space so they are readable, but at the same time expect a box you create to have its lines in one piece and not interrupted by little spaces.
 
So from the 8 pixels of the video memory the VGA driver has to make 9 pixels to bring everything correctly onto the screen, either by adding the 9th pixels as blanks or by repeating the 8th pixels. The "blankings" are in the graphics area (letter) and the "repeatings" are in the pseudographics area (box), just the other way round from what you would assume from the word "graphic". BUT the pseudographics area is hardwired into your graphics card, and if we have a picture which would actually need to be there but is not, it will be treated like one of the graphics area and it will blank instead of repeat.
 
Now the pictures in the glyph files all have 9 columns; they are allowed to have 9 although the video memory will only keep 8. So those pictures should give us an idea of how the final result might look. And the vga script looks at their 8th and 9th column to find out whether it is a letter, so that it can be put into the graphics area (remember that the pseudographics area is hardwired).
 
Knowing all this, look at these lines and note how the last ones give a different result on screen than the glyph makes you believe (simply because the 8th and 9th column are not equal). So having unequal things in columns 8 and 9 of the glyph might be a reason for such a message.
 
glyph says   video memory  VGA does  on screen    description
        +-+          +-+                     +-+
00...00.|.|  00...00.| |   . append  00...00.|.|  graphic (letter)
        | |          | |                     | |
...00000|0|  ...00000| |   0 repeat  ...00000|0|  pseudographic (box)
        | |          | |                     | |
.......0|0|  ........| |   0 repeat  .......0|0|  pseudographics
.......0|.|  ........| |   0 repeat  .......0|0|  with different
........|0|  ........| |   . repeat  ........|.|  8th/9th column
        +-+          +-+                     +-+
         ^----what----^---happens with the----^---9th pixel
       ^--if the 8th pixel is treated according to this area--^
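
The rule behind the table can be written down as a tiny model. This is only a sketch of the rule as described here, not what the vga script really does; the function names and the words "box" and "letter" for the two areas are made up, and it uses the notation of the table ("0" is a set pixel, "." is a blank one).

```shell
# Hypothetical model of the 9th column rule.
# ninth_pixel AREA EIGHTH -> what appears in column 9 on screen
ninth_pixel() {
    case "$1" in
        box)    printf '%s\n' "$2" ;;   # pseudographics area: repeat column 8
        letter) printf '.\n' ;;         # graphics area: append a blank
        *)      return 1 ;;
    esac
}

# glyph_matches AREA ROW -> succeeds if a 9-column glyph row will look
# on screen the way it is drawn in the glyph file
glyph_matches() {
    eighth=$(printf '%s\n' "$2" | cut -c8)
    ninth=$(printf '%s\n' "$2" | cut -c9)
    [ "$(ninth_pixel "$1" "$eighth")" = "$ninth" ]
}

# the last row of the table: drawn with a blank 8th and a set 9th column
glyph_matches box '........0' || echo "9th column will be displayed differently"
```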

If you run into this with the koi8-r and koi8-u fonts, the reason is a different one. With no locale specified, running build.sh will announce 234 characters in the font for both files; but 8r has 7 of them and 8u has 3 of them displayed incorrectly. So the difference between 8r and 8u must eliminate 4 offending characters.
 
Finding the difference is done simply by diff-ing their text source files. Generally speaking, 8r has drawings and 8u replaces all drawings with letters. Looking into the glyph files for all the drawings shows that 4 of them will be recognized as pseudographics (8/9 being equally zero, like a box) and the other 4 of them will be recognized as graphics (8/9 being equally dot, so a letter).
 
As we only deal with columns 8/9 here, we can also say: 4 replace graphic with graphic, so there is no change; and 4 replace pseudographic with graphic, and the offending ones are gone. So pseudographics (these are the drawings which continue straight into the next one) will be displayed incorrectly (like normal letters).
 
Now we look at the definition lines as a whole and we see that all Unicode values are connected to HEX values of characters which are somewhere between =Cx and =Dx. These belong to the iso range and are assumed to be letters and not drawings.