File Information
Library: Encodings
Package: Encodings
Header: Poco/DoubleByteEncoding.h
Description
This abstract class is a base class for various double-byte character set (DBCS) encodings.
Double-byte encodings are variants of multi-byte encodings where each (Unicode) code point is represented by one or two bytes. Unicode code points are restricted to the Basic Multilingual Plane.
Subclasses must define the encoding names, a static CharacterMap, and static Mapping and reverse Mapping tables, and pass these to the DoubleByteEncoding constructor.
Inheritance
Direct Base Classes: TextEncoding
All Base Classes: TextEncoding
Known Derived Classes: Windows936Encoding, Windows1254Encoding, Windows932Encoding, MacJapaneseEncoding, MacChineseTradEncoding, MacCyrillicEncoding, MacRomanEncoding, Windows1257Encoding, Windows1258Encoding, Windows874Encoding, MacCentralEurRomanEncoding, Windows1253Encoding, Windows1256Encoding, Windows949Encoding, Windows950Encoding, MacChineseSimpEncoding, Windows1255Encoding, MacKoreanEncoding
Member Summary
Member Functions: canonicalName, characterMap, convert, isA, map, queryConvert, reverseMap, sequenceLength
Inherited Functions: add, byName, canonicalName, characterMap, convert, find, global, isA, manager, queryConvert, remove, sequenceLength
Nested Classes
struct Mapping
Constructors
DoubleByteEncoding
DoubleByteEncoding(
const char * * names,
const TextEncoding::CharacterMap & charMap,
const Mapping mappingTable[],
std::size_t mappingTableSize,
const Mapping reverseMappingTable[],
std::size_t reverseMappingTableSize
);
Creates a DoubleByteEncoding using the given mapping and reverse-mapping tables.
names must be a static, NULL-terminated array in the derived class that contains the names of this encoding, declared as:
const char* MyEncoding::_names[] = { "myencoding", "MyEncoding", NULL };
The first entry in names must be the canonical name.
charMap must be a static CharacterMap giving information about double-byte character sequences.
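As a rough illustration of what such a character map conveys, the sketch below assumes the convention documented for TextEncoding::CharacterMap: an entry >= 0 is the code point of a single-byte character, -1 marks an invalid byte, and -n marks the lead byte of an n-byte sequence. The local CharacterMap typedef and the helpers sequenceLengthFor and fillToyMap are hypothetical stand-ins, not part of the library.

```cpp
// Stand-in for Poco::TextEncoding::CharacterMap (assumed: an array of 256 ints).
typedef int CharacterMap[256];

// Returns the number of bytes in the sequence that starts with `lead`,
// or -1 if `lead` cannot start a valid sequence.
int sequenceLengthFor(const CharacterMap& charMap, unsigned char lead)
{
    int entry = charMap[lead];
    if (entry >= 0) return 1;    // single byte; entry is the code point itself
    if (entry == -1) return -1;  // invalid byte
    return -entry;               // -n marks the lead byte of an n-byte sequence
}

// Hypothetical toy map: ASCII maps to itself, 0x81..0xFE are lead bytes of
// two-byte sequences, and every other byte is invalid.
void fillToyMap(CharacterMap& charMap)
{
    for (int b = 0; b < 256; ++b)
    {
        if (b < 0x80) charMap[b] = b;
        else if (b >= 0x81 && b <= 0xFE) charMap[b] = -2;
        else charMap[b] = -1;
    }
}
```

A real encoding would supply a hand-written static table rather than filling one at runtime; the loop only keeps the sketch short.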
For each mappingTable item, from must be a value in range 0x0100 to 0xFFFF, with the upper byte representing the first character in the sequence and the lower byte representing the second character in the sequence.
For each reverseMappingTable item, from must be a Unicode code point from the Basic Multilingual Plane, and to is a one-byte or two-byte sequence. As with mappingTable, a one-byte sequence is in range 0x00 to 0xFF, and a two-byte sequence is in range 0x0100 to 0xFFFF.
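The 16-bit representation of byte sequences described above can be sketched as follows. The helper names packSingle and packDouble are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// A one-byte sequence is stored unchanged and stays in range 0x00 to 0xFF.
std::uint16_t packSingle(unsigned char b)
{
    return b;
}

// A two-byte sequence is stored with the first byte in the upper 8 bits
// and the second byte in the lower 8 bits.
std::uint16_t packDouble(unsigned char first, unsigned char second)
{
    return static_cast<std::uint16_t>((first << 8) | second);
}
```

For instance, the byte pair 0x81 0x40 would be stored as 0x8140 under this scheme.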
Unicode code points are restricted to the Basic Multilingual Plane (code points 0x0000 to 0xFFFF).
Items in both tables must be sorted by from, in ascending order.
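Sorting by from makes a binary search over the tables possible. The sketch below shows one way such a lookup could work; it is not necessarily how the library implements it. The local Mapping struct is a stand-in with the from and to fields named in the text, and lookup and toyTable are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Stand-in for the nested Mapping struct (field types are an assumption).
struct Mapping
{
    std::uint16_t from;
    std::uint16_t to;
};

// Looks up `key` in a table sorted by `from` in ascending order.
// Returns the matching `to` value, or -1 if there is no entry for `key`.
int lookup(const Mapping* table, std::size_t size, std::uint16_t key)
{
    const Mapping* end = table + size;
    const Mapping* it = std::lower_bound(table, end, key,
        [](const Mapping& m, std::uint16_t k) { return m.from < k; });
    if (it != end && it->from == key) return it->to;
    return -1;
}

// Illustrative entries only; verify against a real code page table before use.
static const Mapping toyTable[] = {
    { 0x8140, 0x3000 },
    { 0x8141, 0x3001 },
    { 0x8142, 0x3002 }
};
```

If the table were not sorted, std::lower_bound would have undefined results, which is why the ordering requirement matters.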
Destructor
~DoubleByteEncoding
Destroys the DoubleByteEncoding.
Member Functions
canonicalName
const char * canonicalName() const;
See also: Poco::TextEncoding::canonicalName()
characterMap
const CharacterMap & characterMap() const;
See also: Poco::TextEncoding::characterMap()
convert
int convert(
const unsigned char * bytes
) const;
See also: Poco::TextEncoding::convert()
convert
int convert(
int ch,
unsigned char * bytes,
int length
) const;
See also: Poco::TextEncoding::convert()
isA
bool isA(
const std::string & encodingName
) const;
See also: Poco::TextEncoding::isA()
queryConvert
int queryConvert(
const unsigned char * bytes,
int length
) const;
See also: Poco::TextEncoding::queryConvert()
sequenceLength
int sequenceLength(
const unsigned char * bytes,
int length
) const;
See also: Poco::TextEncoding::sequenceLength()
map
int map(
Poco::UInt16 encoded
) const;
Maps a double-byte encoded character to its Unicode code point.
Returns the Unicode code point, or -1 if the encoded character is invalid and cannot be mapped.
reverseMap
int reverseMap(
int cp
) const;
Maps a Unicode code point to its double-byte representation.
Returns -1 if the code point cannot be mapped, otherwise a value in range 0 to 0xFF for single-byte mappings, or 0x0100 to 0xFFFF for double-byte mappings.
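A caller holding a value in the format returned by reverseMap() could turn it back into bytes along these lines. This is a minimal sketch: writeEncoded is a hypothetical helper, and its return convention (0 on failure) is an assumption made for the example, not the behavior of convert().

```cpp
// Writes the byte sequence for a reverseMap()-style value into `bytes`.
// Returns the number of bytes written, or 0 if the value signals "unmapped"
// or the buffer is too small (an assumption made for this sketch).
int writeEncoded(int mapped, unsigned char* bytes, int length)
{
    if (mapped < 0) return 0;  // -1: the code point has no mapping
    if (mapped <= 0xFF)
    {
        // Single-byte mapping: the value is the byte itself.
        if (length < 1) return 0;
        bytes[0] = static_cast<unsigned char>(mapped);
        return 1;
    }
    // Double-byte mapping: upper byte first, lower byte second.
    if (length < 2) return 0;
    bytes[0] = static_cast<unsigned char>(mapped >> 8);
    bytes[1] = static_cast<unsigned char>(mapped & 0xFF);
    return 2;
}
```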