class DoubleByteEncoding

Library: Encodings
Package: Encodings
Header: Poco/DoubleByteEncoding.h


This abstract class is a base class for various double-byte character set (DBCS) encodings.

Double-byte encodings are variants of multi-byte encodings where each (Unicode) code point is represented by one or two bytes. Unicode code points are restricted to the Basic Multilingual Plane.

Subclasses must define the encoding names, a static CharacterMap, and static Mapping and reverse Mapping tables, and pass these to the DoubleByteEncoding constructor.


Direct Base Classes: TextEncoding

All Base Classes: TextEncoding

Known Derived Classes: MacChineseSimpEncoding, MacCentralEurRomanEncoding, MacJapaneseEncoding, MacCyrillicEncoding, MacChineseTradEncoding, MacKoreanEncoding, Windows1253Encoding, MacRomanEncoding, Windows1255Encoding, Windows1254Encoding, Windows1257Encoding, Windows1256Encoding, Windows1258Encoding, Windows874Encoding, Windows932Encoding, Windows949Encoding, Windows936Encoding, Windows950Encoding

Member Summary

Member Functions: canonicalName, characterMap, convert, isA, map, queryConvert, reverseMap, sequenceLength

Inherited Functions: add, byName, canonicalName, characterMap, convert, find, global, isA, manager, queryConvert, remove, sequenceLength

Nested Classes

struct Mapping



DoubleByteEncoding protected

DoubleByteEncoding(
    const char * * names,
    const TextEncoding::CharacterMap & charMap,
    const Mapping mappingTable[],
    std::size_t mappingTableSize,
    const Mapping reverseMappingTable[],
    std::size_t reverseMappingTableSize
);

Creates a DoubleByteEncoding using the given mapping and reverse-mapping tables.

names must be a static array declared in the derived class, containing the names of this encoding, declared as:

const char* MyEncoding::_names[] =

The first entry in names must be the canonical name.
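A names array of this shape might be initialized as in the following sketch. The class name, the encoding names and the helper functions are hypothetical placeholders, and the terminating null pointer is an assumption about how the array is scanned; none of this is taken from the library source.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names table for an imaginary encoding. The first entry is
// the canonical name; the terminating null pointer is an assumption.
static const char* kMyEncodingNames[] =
{
    "MyEncoding",   // canonical name (placeholder)
    "myenc",        // alias (placeholder)
    nullptr
};

// Hypothetical helper: case-insensitive comparison of ASCII names.
static bool equalsIgnoreCase(const std::string& a, const char* b)
{
    std::size_t i = 0;
    for (; i < a.size() && b[i] != '\0'; ++i)
    {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return i == a.size() && b[i] == '\0';
}

// Sketch of an isA-style check: true if encodingName matches any entry.
bool matchesName(const char** names, const std::string& encodingName)
{
    for (const char** p = names; *p != nullptr; ++p)
    {
        if (equalsIgnoreCase(encodingName, *p)) return true;
    }
    return false;
}

// Sketch of a canonicalName-style accessor: the first entry of the table.
const char* canonicalNameOf(const char** names)
{
    return names[0];
}
```

Because the array is static, the pointer handed to the constructor stays valid for the lifetime of the encoding object.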

charMap must be a static CharacterMap giving information about double-byte character sequences.
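The sketch below illustrates one way such a character map could describe double-byte sequences. It assumes the TextEncoding convention that the map has 256 int entries, where a non-negative entry is the code point of a single-byte character, -1 marks an invalid byte, and -n marks the lead byte of an n-byte sequence. The byte ranges and both function names are hypothetical.

```cpp
// Stand-in for TextEncoding::CharacterMap (assumed to be an array of 256 ints).
typedef int CharacterMap[256];

// Fills a hypothetical map: ASCII bytes map to themselves, bytes 0x81..0xFE
// are lead bytes of two-byte sequences, and all other bytes are invalid.
void fillExampleMap(CharacterMap& map)
{
    for (int i = 0; i < 256; ++i)
    {
        if (i < 0x80)
            map[i] = i;      // single-byte character
        else if (i >= 0x81 && i <= 0xFE)
            map[i] = -2;     // lead byte of a two-byte sequence
        else
            map[i] = -1;     // invalid byte
    }
}

// Hypothetical helper: number of bytes in the sequence that starts with
// `lead`, or 0 if `lead` is marked invalid in the map.
int sequenceLengthFor(const CharacterMap& map, unsigned char lead)
{
    int entry = map[lead];
    if (entry >= 0) return 1;    // single-byte character
    if (entry == -1) return 0;   // invalid byte
    return -entry;               // -n encodes an n-byte sequence
}
```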

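For each mappingTable item, from must be a value in range 0x00 to 0xFF for single-byte mappings, or in range 0x0100 to 0xFFFF for double-byte mappings, with the upper (most significant) byte representing the first character in the sequence and the lower byte representing the second character in the sequence.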
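As a small illustration of this layout, a two-byte sequence can be packed into a single 16-bit value with the first byte in the upper half and the second byte in the lower half. The function names are hypothetical and only demonstrate the bit arithmetic.

```cpp
#include <cstdint>

// Packs a two-byte sequence into a 16-bit value: the first byte goes into
// the upper 8 bits, the second byte into the lower 8 bits.
std::uint16_t packSequence(unsigned char first, unsigned char second)
{
    return static_cast<std::uint16_t>((first << 8) | second);
}

// Extracts the first byte of the sequence (upper 8 bits).
unsigned char firstByte(std::uint16_t encoded)
{
    return static_cast<unsigned char>(encoded >> 8);
}

// Extracts the second byte of the sequence (lower 8 bits).
unsigned char secondByte(std::uint16_t encoded)
{
    return static_cast<unsigned char>(encoded & 0xFF);
}
```

With this packing, any sequence whose first byte is non-zero yields a value of at least 0x0100, which keeps it distinct from the single-byte range.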

For each reverseMappingTable item, from must be a Unicode code point from the Basic Multilingual Plane, and to is a one-byte or two-byte sequence. As with mappingTable, a one-byte sequence is in range 0x00 to 0xFF, and a two-byte sequence is in range 0x0100 to 0xFFFF.

Unicode code points are restricted to the Basic Multilingual Plane (code points 0x0000 to 0xFFFF).

Items in both tables must be sorted by from, in ascending order.
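Keeping the tables sorted by from makes a binary search possible. The following is a minimal sketch of such a lookup, not the library's actual implementation. The Mapping struct here is a stand-in (the field names follow the text, the 16-bit field widths are an assumption), and the table values and the lookup function are purely illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Stand-in for the nested Mapping struct (field widths are an assumption).
struct Mapping
{
    std::uint16_t from;
    std::uint16_t to;
};

// Tiny illustrative table, sorted by `from` in ascending order.
static const Mapping kExampleTable[] =
{
    { 0x0041, 0x0041 },
    { 0x8140, 0x3000 },
    { 0x8141, 0x3001 },
};
static const std::size_t kExampleTableSize =
    sizeof(kExampleTable) / sizeof(kExampleTable[0]);

// Hypothetical lookup: binary search for `key` in a table sorted by `from`.
// Returns the mapped value, or -1 if there is no entry for `key`.
int lookup(const Mapping table[], std::size_t size, std::uint16_t key)
{
    const Mapping* end = table + size;
    const Mapping* it = std::lower_bound(table, end, key,
        [](const Mapping& m, std::uint16_t k) { return m.from < k; });
    if (it != end && it->from == key) return it->to;
    return -1;
}
```

The same function would serve both directions, since the reverse table is sorted by its own from field (the Unicode code point).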


~DoubleByteEncoding protected virtual

~DoubleByteEncoding();

Destroys the DoubleByteEncoding.

Member Functions

canonicalName virtual

const char * canonicalName() const;

characterMap virtual

const CharacterMap & characterMap() const;

convert virtual

int convert(
    const unsigned char * bytes
) const;

convert virtual

int convert(
    int ch,
    unsigned char * bytes,
    int length
) const;

isA virtual

bool isA(
    const std::string & encodingName
) const;

queryConvert virtual

int queryConvert(
    const unsigned char * bytes,
    int length
) const;

sequenceLength virtual

int sequenceLength(
    const unsigned char * bytes,
    int length
) const;

map protected

int map(
    Poco::UInt16 encoded
) const;

Maps a double-byte encoded character to its Unicode code point.

Returns the Unicode code point, or -1 if the encoded character is invalid and cannot be mapped.

reverseMap protected

int reverseMap(
    int cp
) const;

Maps a Unicode code point to its double-byte representation.

Returns -1 if the code point cannot be mapped, otherwise a value in range 0 to 0xFF for single-byte mappings, or 0x0100 to 0xFFFF for double-byte mappings.
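The return value convention above can be turned back into a byte sequence as sketched below. The writeSequence function is hypothetical; it only shows how a caller might interpret a reverseMap-style result, and its own return convention (0 for an unmappable value) is a choice made for this example.

```cpp
// Hypothetical helper: writes the byte sequence for a reverseMap-style
// result into `bytes`. Returns the number of bytes the sequence needs
// (1 or 2), or 0 if `mapped` is outside 0..0xFFFF (for example -1).
// Bytes are written only if the buffer is non-null and large enough.
int writeSequence(int mapped, unsigned char* bytes, int length)
{
    if (mapped < 0 || mapped > 0xFFFF) return 0;

    if (mapped <= 0xFF)
    {
        // Single-byte mapping: the value is the byte itself.
        if (bytes && length >= 1)
            bytes[0] = static_cast<unsigned char>(mapped);
        return 1;
    }

    // Double-byte mapping: upper byte first, lower byte second.
    if (bytes && length >= 2)
    {
        bytes[0] = static_cast<unsigned char>(mapped >> 8);
        bytes[1] = static_cast<unsigned char>(mapped & 0xFF);
    }
    return 2;
}
```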