[PATCH] Fix EPG for UPC direct

m.kapoun at kapik.net (m.kapoun@xxxxxxxxx) · Tue Jan 2 23:44:33 2007

Hi,
I converted ISO 6937 to ISO 8859-1 with a small patch.
It was the simplest way, because I only deleted the non-spacing characters
(diacritical marks).
It was a fast solution for me (strange letters in the EPG and bad recording file
names), but it is not
good, because the EPG then doesn't contain the correct Czech, ..., ... letters.
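
The "delete the non-spacing characters" approach described above can be sketched roughly like this. It is only an illustration, not the code from my patch: the function name strip_nonspacing is hypothetical, and it assumes a NUL-terminated text that is already known to use table 00.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not taken from the patch): remove the table 00
 * non-spacing bytes 0xC0..0xCF from a NUL-terminated string, in place.
 * The base letter that follows each mark is kept, so the text stays
 * readable but loses its diacritics. Returns the new length. */
size_t strip_nonspacing(unsigned char *s)
{
    assert(s != NULL);
    size_t out = 0;
    for (size_t in = 0; s[in] != '\0'; in++) {
        if (s[in] >= 0xC0 && s[in] <= 0xCF)
            continue;           /* skip the diacritical mark */
        s[out++] = s[in];       /* keep everything else unchanged */
    }
    s[out] = '\0';
    return out;
}
```

For example, the bytes 'V' 0xC2 'a' 'c' 'l' 'a' 'v' come out as plain "Vaclav". Bytes outside 0xC0..0xCF are not translated at all, which is one more reason this is only a quick workaround.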

How it works in the Czech Republic:
All DVB-T TV channels use 'table 00 - Latin alphabet'. This table is a superset
of ISO/IEC 6937. I talked with people from České radiokomunikace
(the broadcaster) and CzechTV, and wanted to explain to them that ISO 8859-2 is
a better choice :-(. Their opinion is that "table 00" is more complete: they can
display non-Czech characters without problems.

I am afraid that most European DVB broadcasters use 'table 00 - Latin alphabet',
but they don't use the characters
0xC0 to 0xCF (non-spacing characters: the character is printed together
with the next character, like on a mechanical typewriter. It is crazy).
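
To show what "printed together with the next character" means for a converter, here is a minimal sketch that combines a non-spacing mark with the following base letter into one ISO 8859-2 byte. It is an assumption-laden illustration, not the code from either patch: the names compose_entry, compose_table and compose_to_latin2 are hypothetical, and the table covers only a few lowercase Czech letters for the two marks named in the quoted patch comment (0xC2 acute, 0xCF caron).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, deliberately incomplete table:
 * (non-spacing mark, base letter) -> ISO 8859-2 byte. */
struct compose_entry { unsigned char mark, base, latin2; };

static const struct compose_entry compose_table[] = {
    { 0xC2, 'a', 0xE1 }, { 0xC2, 'e', 0xE9 }, { 0xC2, 'i', 0xED },
    { 0xC2, 'o', 0xF3 }, { 0xC2, 'u', 0xFA }, { 0xC2, 'y', 0xFD },
    { 0xCF, 'c', 0xE8 }, { 0xCF, 'e', 0xEC }, { 0xCF, 'n', 0xF2 },
    { 0xCF, 'r', 0xF8 }, { 0xCF, 's', 0xB9 }, { 0xCF, 'z', 0xBE },
};

/* Return the composed ISO 8859-2 byte, or the bare base letter if the
 * pair is not in the table (same result as deleting the mark). */
static unsigned char compose_pair(unsigned char mark, unsigned char base)
{
    size_t n = sizeof compose_table / sizeof compose_table[0];
    for (size_t i = 0; i < n; i++)
        if (compose_table[i].mark == mark && compose_table[i].base == base)
            return compose_table[i].latin2;
    return base;
}

/* Convert a NUL-terminated table 00 string into dst (at most dstsize
 * bytes including the terminator). Bytes outside 0xC0..0xCF are copied
 * unchanged, which is only correct for the ASCII range. */
size_t compose_to_latin2(const unsigned char *src, unsigned char *dst,
                         size_t dstsize)
{
    assert(src != NULL && dst != NULL && dstsize > 0);
    size_t out = 0;
    for (size_t in = 0; src[in] != '\0' && out + 1 < dstsize; in++) {
        unsigned char c = src[in];
        if (c >= 0xC0 && c <= 0xCF) {
            unsigned char base = src[in + 1];
            if (base == '\0')
                break;          /* dangling mark at end of text: drop it */
            c = compose_pair(c, base);
            in++;               /* the base letter has been consumed */
        }
        dst[out++] = c;
    }
    dst[out] = '\0';
    return out;
}
```

A real converter would need the full set of marks and base letters (upper case included) and a mapping for the other table 00 bytes above 0x9F, so this is only meant to show the two-byte-to-one-byte idea.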

I have no idea how to solve this correctly, or how to select the default
character set for VDR.

A few lines from ETSI EN 300 468 V1.7.1 (2005-12)

Annex A.2
If the first byte of the text field has a value in the range "0x20" to
"0xFF" then this and all subsequent bytes in the text
item are coded using the default character coding table (table 00 - Latin
alphabet) of figure A.1.
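
As a sketch of the rule quoted above, the check on the first byte could look like this. The function name uses_default_table is hypothetical, and only the quoted case (first byte 0x20 to 0xFF) is handled; other first-byte values fall under parts of Annex A.2 that are not quoted here. Treating an empty text as "no" is my own assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check based only on the rule quoted from Annex A.2:
 * if the first byte of the text field is in the range 0x20..0xFF, the
 * whole item uses the default table 00. Returns 1 in that case, 0
 * otherwise (including for an empty text, which is an assumption). */
int uses_default_table(const unsigned char *text, size_t len)
{
    assert(text != NULL || len == 0);
    if (len == 0)
        return 0;
    return text[0] >= 0x20;     /* an unsigned char is never above 0xFF */
}
```

With this, a text starting with a normal letter reports table 00, while a text starting with a byte below 0x20 does not.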

Notes for Figure A.1

Figure A.1: Character code table 00 - Latin alphabet
NOTE 1: The SPACE character is located in position 20h of the code table.
NOTE 2: NBSP = no-break space.
NOTE 3: SHY = soft hyphen.
NOTE 4: This table is a superset of ISO/IEC 6937 [24] with addition of the
Euro symbol.
NOTE 5: All characters in column C are non-spacing characters (diacritical
marks).

Milos

----- Original Message ----- 
From: "Klaus Schmidinger" <Klaus.Schmidinger@xxxxxxxxxx>
To: <vdr@xxxxxxxxxxx>
Sent: Sunday, December 10, 2006 11:51 AM
Subject: Re: [PATCH] Fix EPG for UPC direct

> Thiemo Gehrke wrote:
>> UPC is a provider for middle european countries (Czechia, Hungary and 
>> Poland). They use iso6937-2 for encoding their EPG data so this looks 
>> quite strange in the vdr.
>> The applied patch does a "remapping" to iso8859-2 so that characters are 
>> displayed correct. (Currently only tested with Czech and Hungarian, but 
>> should also work for Polish)
>>
>> While testing this with the help of an hungarian user, i also found out 
>> that the the codepage for Hungary must be 8859-2, not -1.
>>
>> The patch is work by Helmut Auer.
>>
>> cheers,
>> Tim
>>
>>
>> ------------------------------------------------------------------------
>>
>> --- vdr-1.4.4-vanilla/epg.c 2006-10-28 11:12:42.000000000 +0200
>> +++ vdr-1.4/epg.c 2006-11-28 12:39:33.000000000 +0100
>> @@ -18,6 +18,165 @@
>>
>>  #define RUNNINGSTATUSTIMEOUT 30 // seconds before the running status is 
>> considered unknown
>>
>> +// UPC Direct / HBO strange two-character encoding. 0xC2 means acute, 
>> 0xCF caron.
>> +// many thanks to the czechs who helped me while solving this.
>> ...
>
> How is their encoding coded in the first byte of the texts?
> I can't seem to find an encoding for iso6937-2 in ETSI EN 300 46, section 
> A.2.
>
> Also, what happens if you run such a string through iconv() to convert it
> from iso6937-2 to iso8859-2 or UTF-8?
>
> I'm asking because this is how VDR will handle character sets in the next
> version.
>
> Klaus
>
> _______________________________________________
> vdr mailing list
> vdr@xxxxxxxxxxx
> http://www.linuxtv.org/cgi-bin/mailman/listinfo/vdr
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: A_1.png
Type: image/png
Size: 55393 bytes
Desc: not available
Url : http://www.linuxtv.org/pipermail/vdr/attachments/20070102/9406dae3/A_1-0001.png