How to map /dev/vcsa character values to Unicode code points

n8kl@xxxxxxxxxxxxx (Kitty Litter) · Sun, 11 Dec 2011 08:46:30 -0500

There are charts you should be able to find giving code points and the 
characters they represent. If you know C, I can give you code to demonstrate 
how to get the code point from a 2-, 3-, or 4-byte UTF-8 sequence. Basically, 
when you see a byte with the high bit set (that is, a non-ASCII byte), you 
count how many leading 1 bits it has. If there are 3, for example, then it 
should start a 3-byte UTF-8 sequence. You then check that the second and 
third bytes each have bit 7 set to 1 and bit 6 set to 0 (binary 10xxxxxx). 
Then you drop those marker bits, concatenate the remaining payload bits (4 
from the first byte, 6 from each following byte), and come up with the code 
point. Quite complicated!
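
The steps above can be sketched in C roughly as follows. This is only a 
minimal illustration, not code from the original poster: the function name 
utf8_decode and its return convention (bytes consumed, or 0 on error) are 
made up for this example. It also rejects overlong encodings, surrogates, 
and values above U+10FFFF, which goes slightly beyond the description.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one UTF-8 sequence starting at s, with len bytes available.
 * On success, store the code point in *cp and return the number of
 * bytes consumed (1 to 4). Return 0 if the sequence is malformed or
 * truncated. (utf8_decode is a hypothetical name for illustration.)
 */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    /* Smallest code point that legitimately needs n bytes. */
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t n, i;
    uint32_t c;

    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                    /* 0xxxxxxx: plain ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte lead */
        n = 2;
        c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte lead */
        n = 3;
        c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte lead */
        n = 4;
        c = s[0] & 0x07;
    } else {
        return 0;   /* stray continuation byte or invalid lead byte */
    }

    if (len < n)
        return 0;   /* sequence is cut short */

    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)        /* must look like 10xxxxxx */
            return 0;
        c = (c << 6) | (s[i] & 0x3F);     /* append the 6 payload bits */
    }

    /* Reject overlong forms, UTF-16 surrogates, and out-of-range values. */
    if (c < min_cp[n] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;

    *cp = c;
    return n;
}
```

For example, passing the three bytes E2 82 AC with len 3 should return 3 and 
set *cp to 0x20AC, the code point of the euro sign.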