The Teletext Converter is a Java library that converts teletext files between the EP1, ETP and Texas-XML formats. It can also render teletext files as PNG images.
The basis is the “Fernsehtext-Spezifikation (Datum: Juni 1986)” (Technische Richtlinie 8R4), which is available from the Institut für Rundfunktechnik. It roughly corresponds to the "Enhanced Teletext specification", Presentation-Level 1.
The Teletext Converter supports the “German” national alphabet variant of level 1 of this specification.
Requirements
The Teletext Converter requires Java 8. Dependencies on other libraries are defined in the pom.xml.
Command-line utility
The library contains a main class for converting files from the command line:
java -jar teletext-converter-1.0.0-executable.jar test.etp test.png
Reader and Writer
For each supported format there is a reader and a writer for reading and writing files in that format; PNG can only be written. Files are passed either as a byte array or as a string. The available readers and writers are:
Format | Reader | Writer |
---|---|---|
EP1 | com.subshell.teletext.ep1.EP1Reader | com.subshell.teletext.ep1.EP1Writer |
ETP | com.subshell.teletext.etp.ETPReader | com.subshell.teletext.etp.ETPWriter |
Texas-XML | com.subshell.teletext.xml.TexasXmlReader | com.subshell.teletext.xml.TexasXmlWriter |
PNG | (none) | com.subshell.teletext.png.PngWriter |
Converting a file from Texas-XML to ETP could look like this:
byte[] xmlData = FileUtils.readFileToByteArray(new File("teletext.xml"));
TeletextDocument doc = new TexasXmlReader().read(xmlData);
byte[] etpData = new ETPWriter().toByteArray(doc);
FileUtils.writeByteArrayToFile(new File("teletext.etp"), etpData);
TeletextDocument and TeletextPage
For conversion between the different formats, teletext files are represented internally as a TeletextDocument. A TeletextDocument corresponds to one teletext page, including its subpages. It can have a name and a slot number, but these are only used when exporting to Texas XML. Its main content is a list of TeletextPages.
The TeletextPage represents a normal page or a subpage in teletext. It consists of at most 23 lines of at most 40 characters each. The characters use the encoding of the teletext specification: for example, the letter “a” has the value 0x61 (hexadecimal) and the control character “alpha white” has the value 7. The encoding of letters and punctuation marks therefore essentially matches ISO-8859-1, except for umlauts and some special characters.
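The size limits described above can be checked before a page is built. The following is a minimal sketch: `PageLimits` and `fits` are hypothetical names and not part of the Teletext Converter API, and it assumes that each control character occupies one of the 40 columns.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Teletext Converter API. */
final class PageLimits {
    static final int MAX_LINES = 23;   // lines per page, as stated above
    static final int MAX_COLUMNS = 40; // characters per line (assumed to include control characters)

    /** Returns true if the given lines fit on a single teletext page. */
    static boolean fits(List<String> lines) {
        if (lines.size() > MAX_LINES) {
            return false;
        }
        for (String line : lines) {
            if (line.length() > MAX_COLUMNS) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // One line: alpha cyan (code 6) followed by text in teletext encoding.
        List<String> lines = Arrays.asList("\u0006Neues vom B\u007Dchermarkt");
        System.out.println(fits(lines));
    }
}
```

A check like this could run before lines are added to a page, so that over-long input is reported early instead of producing an invalid page.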
For the control characters there is a list of constants in the ControlCharacter interface. The Alphanumeric interface contains constants for the umlauts and other characters that deviate from the ISO encoding.
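To illustrate how the encoding deviates from ISO-8859-1, here is a sketch of a text encoder for the German variant. `GermanTeletextEncoding` is a hypothetical class, not the library's Alphanumeric interface. Only the value 0x7D for “ü” is confirmed by this README; the remaining codes are assumptions based on the German national option of the teletext specification and should be checked against it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Teletext Converter API. */
final class GermanTeletextEncoding {

    // Unicode character -> teletext code in the German variant.
    // Assumption: only 0x7D (small u umlaut) is confirmed by this README.
    private static final Map<Character, Character> SPECIAL = new HashMap<>();
    static {
        SPECIAL.put('\u00A7', (char) 0x40); // section sign
        SPECIAL.put('\u00C4', (char) 0x5B); // capital A umlaut
        SPECIAL.put('\u00D6', (char) 0x5C); // capital O umlaut
        SPECIAL.put('\u00DC', (char) 0x5D); // capital U umlaut
        SPECIAL.put('\u00B0', (char) 0x60); // degree sign
        SPECIAL.put('\u00E4', (char) 0x7B); // small a umlaut
        SPECIAL.put('\u00F6', (char) 0x7C); // small o umlaut
        SPECIAL.put('\u00FC', (char) 0x7D); // small u umlaut
        SPECIAL.put('\u00DF', (char) 0x7E); // sharp s
    }

    /** Encodes text; control codes (0x00-0x1F) and unaffected ASCII pass through unchanged. */
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            Character mapped = SPECIAL.get(c);
            if (mapped != null) {
                out.append(mapped.charValue());
            } else if (c < 0x7F && !SPECIAL.containsValue(c)) {
                out.append(c); // same code as in ISO-8859-1
            } else {
                // Either outside the 7-bit range or displaced by a national character.
                throw new IllegalArgumentException(
                        "No teletext code for character U+" + Integer.toHexString(c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String line = encode("\u0006Neues vom B\u00FCchermarkt");
        // Print the hexadecimal code of the encoded u umlaut.
        System.out.println(Integer.toHexString(line.charAt(12)));
    }
}
```

ASCII characters whose code positions are taken by the replacements, such as `{` or `@`, are rejected in this sketch because they have no representation in the assumed character set.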
Creating and exporting a TeletextDocument directly in Java could look like this:
TeletextDocument doc = new TeletextDocument();
TeletextPage page = new TeletextPage();
doc.addPage(page);
// Using constants.
String line1 = ControlCharacter.ALPHA_CYAN + "Neues vom B" + Alphanumeric.SMALL_UMLAUT_U + "chermarkt";
page.addLine(line1);
// Using codepoints.
String line2 = "\u0006Neues vom B\u007Dchermarkt";
page.addLine(line2);
for (int i = 2; i < 23; i++) {
page.addLine(Integer.toString(i));
}
FileUtils.writeByteArrayToFile(new File("test.xml"), new TexasXmlWriter().toByteArray(doc));
FileUtils.writeByteArrayToFile(new File("test.png"), new PngWriter().toByteArrays(doc).get(0));
The resulting PNG looks like this:
Texas-XML
For exchanging teletext files, subshell has developed the Texas XML format for use with the Texas editor or the "Sophora Add-On Teletext". The Teletext Converter supports the subset of Texas XML that is necessary for displaying teletext pages. There is an XML schema for this subset, divided into two files: slot.xsd and tt.xsd.
A Texas XML file contains a <slot> element with one <page> element per teletext (sub)page. The text and the control characters are specified within the page. The names of the control characters can be found in the table "Control Characters (in German)".
Each line must end with the <tt:br> element. In Texas-XML, umlauts and special characters such as “§” can be entered directly. They are translated by the TexasXmlReader into the corresponding characters of the teletext specification.
Example of a Texas XML file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<slot xmlns="http://www.subshell.de/slot" xmlns:tt="http://www.subshell.de/tt" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.subshell.de/slot slot.xsd" number="379">
<page>
<tt:alpha_white /><tt:new_bg /><tt:graphic_blue /> 5 <tt:br />
<tt:alpha_white /><tt:new_bg /><tt:graphic_blue />thi,0&lt;d <tt:br />
<tt:alpha_white /><tt:new_bg /><tt:graphic_blue />5kjp%7i <tt:alpha_blue />NDR FERNSEHEN <tt:br />
<tt:alpha_cyan /><tt:new_bg /><tt:graphic_blue /> % <tt:br />
<tt:alpha_white />Montag 20.01.14 22:45 - 23:15 Uhr<tt:br />
<tt:alpha_white />Donnerstag 23.01.14 02:30 - 03:00 Uhr<tt:br />
<tt:alpha_white />KULTURJOURNAL<tt:br />
<tt:alpha_white />Neues aus der Kulturszene und vom Bü-<tt:br />
<tt:alpha_white />chermarkt<tt:br />
<tt:br />
<tt:alpha_cyan />1. Von wegen "Sozialtourismus" - Die<tt:br />
<tt:alpha_cyan />Hamburger Ausstellung "Wanderarbeiter"<tt:br />
<tt:alpha_cyan />2. Nie gesehene Filme aus dem Ersten<tt:br />
<tt:alpha_cyan />Weltkrieg - Ein gigantisches Internet-<tt:br />
<tt:alpha_cyan />Erinnerungsprojekt<tt:br />
<tt:alpha_cyan />3. Anschlag auf das Oktoberfest - Der<tt:br />
<tt:alpha_cyan />Spielfilm "Der blinde Fleck" und die<tt:br />
<tt:alpha_cyan />Ignoranz des rechten Terrors<tt:br />
<tt:alpha_cyan />4. Retro-Pop mit Ufos - Julia Westlake<tt:br />
<tt:alpha_cyan />trifft Michy Reincke<tt:br />
<tt:alpha_cyan />5. Die Rasenden" von Karin Beier - Gro-<tt:br />
<tt:alpha_cyan />ße Premiere im Deutschen Schauspielhaus<tt:br />
<tt:alpha_cyan /><tt:new_bg /><tt:alpha_blue />300&lt; TV-PROGRAMM HEUTE >301-305<tt:br />
</page>
</slot>
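The structure described above (one <page> element per page, each line terminated by <tt:br>) can be inspected with the XML parser that ships with the JDK. This is only a sketch: `TexasXmlInspector` is a hypothetical class that does not use the Teletext Converter API or validate against slot.xsd and tt.xsd.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical inspection helper; it does not use the Teletext Converter API. */
final class TexasXmlInspector {
    static final String SLOT_NS = "http://www.subshell.de/slot";
    static final String TT_NS = "http://www.subshell.de/tt";

    /** Returns the number of tt:br line terminators in each page, in document order. */
    static int[] lineCounts(byte[] xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace-based lookups below
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
            NodeList pages = doc.getElementsByTagNameNS(SLOT_NS, "page");
            int[] counts = new int[pages.getLength()];
            for (int i = 0; i < pages.getLength(); i++) {
                Element page = (Element) pages.item(i);
                counts[i] = page.getElementsByTagNameNS(TT_NS, "br").getLength();
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse Texas XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<slot xmlns=\"http://www.subshell.de/slot\""
                + " xmlns:tt=\"http://www.subshell.de/tt\" number=\"379\">"
                + "<page><tt:alpha_white />KULTURJOURNAL<tt:br />"
                + "<tt:alpha_cyan />Erinnerungsprojekt<tt:br /></page></slot>";
        int[] counts = lineCounts(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(counts.length + " page(s), first page has " + counts[0] + " line(s)");
    }
}
```

Counting the <tt:br> elements per page is a quick way to see whether a file stays within the 23-line limit before it is handed to a reader.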
Control Characters (in German)
For each control character, the table gives the code as used in the TeletextPage and the teletext specification, and the XML element name in Texas XML. The XML elements are all in the XML namespace "http://www.subshell.de/tt".
German name | Code (decimal) | Texas-XML element name |
---|---|---|
Alpha Schwarz | 0 | alpha_black |
Alpha Rot | 1 | alpha_red |
Alpha Grün | 2 | alpha_green |
Alpha Gelb | 3 | alpha_yellow |
Alpha Blau | 4 | alpha_blue |
Alpha Magenta | 5 | alpha_magenta |
Alpha Cyan | 6 | alpha_cyan |
Alpha Weiß | 7 | alpha_white |
Blinkende Wiedergabe | 8 | flash |
Ruhende Wiedergabe | 9 | still |
Ende Einblendfeld | 10 | endbox |
Anfang Einblendfeld | 11 | startbox |
Normale Höhe | 12 | normal_height |
Doppelte Höhe | 13 | double_height |
SO (nicht verwendet) | 14 | so |
SI (nicht verwendet) | 15 | si |
Mosaik Schwarz | 16 | graphic_black |
Mosaik Rot | 17 | graphic_red |
Mosaik Grün | 18 | graphic_green |
Mosaik Gelb | 19 | graphic_yellow |
Mosaik Blau | 20 | graphic_blue |
Mosaik Magenta | 21 | graphic_magenta |
Mosaik Cyan | 22 | graphic_cyan |
Mosaik Weiß | 23 | graphic_white |
Verdeckte Wiedergabe | 24 | conceal_display |
Zusammenhängende Grafik | 25 | cont_graphics |
Gerasterte Grafik | 26 | sep_graphics |
ESC (nicht verwendet) | 27 | esc |
Schwarzer Hintergrund | 28 | black_bg |
Neuer Hintergrund | 29 | new_bg |
Überschreibende Grafik | 30 | hold_graphics |
Nichtüberschreibende Grafik | 31 | release_graphics |
Example:
<slot xmlns="http://www.subshell.de/slot" xmlns:tt="http://www.subshell.de/tt" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.subshell.de/slot slot.xsd" number="379">
<page>
<tt:alpha_white /><tt:new_bg /><tt:alpha_black />Schwarz auf Weiß<tt:br />
<tt:alpha_red />Rot auf Schwarz<tt:br />
</page>
</slot>
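The control character table in this section can also be read as a lookup from code to element name. The sketch below uses it to turn one encoded line into a Texas-XML fragment. `ControlCharacterNames` and `toTexasXml` are hypothetical names; this is not the library's ControlCharacter interface or its TexasXmlWriter, and the sketch does not translate umlaut codes back into their XML characters.

```java
/** Hypothetical lookup built from the table above; not part of the Teletext Converter API. */
final class ControlCharacterNames {

    // Index = control code (0-31), value = Texas-XML element name without the tt: prefix.
    private static final String[] ELEMENT_NAMES = {
        "alpha_black", "alpha_red", "alpha_green", "alpha_yellow",
        "alpha_blue", "alpha_magenta", "alpha_cyan", "alpha_white",
        "flash", "still", "endbox", "startbox",
        "normal_height", "double_height", "so", "si",
        "graphic_black", "graphic_red", "graphic_green", "graphic_yellow",
        "graphic_blue", "graphic_magenta", "graphic_cyan", "graphic_white",
        "conceal_display", "cont_graphics", "sep_graphics", "esc",
        "black_bg", "new_bg", "hold_graphics", "release_graphics"
    };

    /** Renders one encoded line as a Texas-XML fragment that ends with the tt:br element. */
    static String toTexasXml(String encodedLine) {
        StringBuilder xml = new StringBuilder();
        for (int i = 0; i < encodedLine.length(); i++) {
            char c = encodedLine.charAt(i);
            if (c < ELEMENT_NAMES.length) {
                // Control code: emit the matching empty element.
                xml.append("<tt:").append(ELEMENT_NAMES[c]).append(" />");
            } else if (c == '<') {
                xml.append("&lt;"); // keep the fragment well-formed
            } else if (c == '&') {
                xml.append("&amp;");
            } else {
                xml.append(c); // printable character, copied unchanged in this sketch
            }
        }
        return xml.append("<tt:br />").toString();
    }

    public static void main(String[] args) {
        // Code 1 is "Alpha Rot" in the table above.
        System.out.println(toTexasXml("\u0001Rot auf Schwarz"));
    }
}
```

Keeping the names in an array indexed by code mirrors the table directly, so an entry can be checked against its row at a glance.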