Poco

class UTF8Encoding

Library: Foundation
Package: Text
Header: Poco/UTF8Encoding.h

Description

UTF-8 text encoding, as defined in RFC 2279 (since obsoleted by RFC 3629).

Inheritance

Direct Base Classes: TextEncoding

All Base Classes: TextEncoding

Member Summary

Member Functions: canonicalName, characterMap, convert, isA, isLegal, queryConvert, sequenceLength

Inherited Functions: add, byName, canonicalName, characterMap, convert, find, global, isA, manager, queryConvert, remove, sequenceLength

Constructors

UTF8Encoding

UTF8Encoding();

Destructor

~UTF8Encoding virtual

~UTF8Encoding();

Member Functions

canonicalName virtual

const char * canonicalName() const;

characterMap virtual

const CharacterMap & characterMap() const;

convert virtual

int convert(
    const unsigned char * bytes
) const;

convert virtual

int convert(
    int ch,
    unsigned char * bytes,
    int length
) const;

isA virtual

bool isA(
    const std::string & encodingName
) const;

isLegal static

static bool isLegal(
    const unsigned char * bytes,
    int length
);

Utility routine that tells whether a sequence of bytes is legal UTF-8. The length argument must be the sequence length pre-determined by the first byte. If fewer bytes than that are available, the sequence is illegal right away. If presented with a length greater than 4, this function returns false, because the Unicode definition of UTF-8 only goes up to 4-byte sequences.

Adapted from ftp://ftp.unicode.org/Public/PROGRAMS/CVTUTF/ConvertUTF.c Copyright 2001-2004 Unicode, Inc.
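
The calling pattern described above (derive the length from the first byte, reject the input if too few bytes are available, then validate) can be sketched as follows. This is a minimal standalone illustration, not the Poco implementation: leadByteLength, isLegalSketch and firstSequenceIsLegal are hypothetical names introduced here, and isLegalSketch is a simplified stand-in that is slightly stricter than the documented contract because it also checks that the first byte matches the given length.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Poco): sequence length implied by the
// first byte, or 0 if the byte cannot start a UTF-8 sequence.
int leadByteLength(unsigned char lead)
{
    if (lead < 0x80) return 1;                   // 0xxxxxxx: ASCII
    if (lead >= 0xC2 && lead <= 0xDF) return 2;  // 110xxxxx (0xC0, 0xC1 are overlong)
    if (lead >= 0xE0 && lead <= 0xEF) return 3;  // 1110xxxx
    if (lead >= 0xF0 && lead <= 0xF4) return 4;  // 11110xxx, capped at U+10FFFF
    return 0;                                    // continuation byte or invalid lead
}

// Hypothetical stand-in for the legality check. It assumes `length` bytes
// are readable at `bytes`, as the documented contract requires.
bool isLegalSketch(const unsigned char* bytes, int length)
{
    if (bytes == nullptr || length < 1 || length > 4) return false;
    if (leadByteLength(bytes[0]) != length) return false;

    // Every trailing byte must look like 10xxxxxx.
    for (int i = 1; i < length; ++i)
    {
        if ((bytes[i] & 0xC0) != 0x80) return false;
    }

    // Some lead bytes narrow the allowed range of the second byte.
    if (length >= 3)
    {
        switch (bytes[0])
        {
        case 0xE0: if (bytes[1] < 0xA0) return false; break; // overlong 3-byte form
        case 0xED: if (bytes[1] > 0x9F) return false; break; // UTF-16 surrogates
        case 0xF0: if (bytes[1] < 0x90) return false; break; // overlong 4-byte form
        case 0xF4: if (bytes[1] > 0x8F) return false; break; // above U+10FFFF
        default: break;
        }
    }
    return true;
}

// Checks the first sequence in a buffer holding `available` bytes.
bool firstSequenceIsLegal(const unsigned char* data, std::size_t available)
{
    if (data == nullptr || available == 0) return false;

    const int n = leadByteLength(data[0]);
    if (n == 0) return false;

    // Not enough bytes available: illegal right away.
    if (static_cast<std::size_t>(n) > available) return false;

    // With Poco, this is where UTF8Encoding::isLegal(data, n) would be called.
    return isLegalSketch(data, n);
}
```

For example, the three bytes E2 82 AC (the euro sign) pass the check when all three are available and fail when only two are. When linking against Poco, the stand-in would be replaced by the static call Poco::UTF8Encoding::isLegal(data, n) at the marked line.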

queryConvert virtual

int queryConvert(
    const unsigned char * bytes,
    int length
) const;

sequenceLength virtual

int sequenceLength(
    const unsigned char * bytes,
    int length
) const;