Uniface Support for Unicode and Other Character Sets

A Uniface installation simultaneously supports at least two character sets: Unicode and the installed character set. It can also support additional character sets.

Unicode

Characters are encoded in Unicode when displayed on a GUI platform or when passed to a Unicode-based component, such as a web service.

Mapping between Unicode (UTF) formats is handled by algorithms.

More specifically:

  • In Microsoft Windows (GUI), the contents of String fields are displayed and entered by mapping between UTF-8 and UTF-16.

    For fields with a C packing code, the character set of the field's entity is used for the field validation.

  • Uniface communicates with Unicode-based components by mapping between UTF-8 and other UTF formats. This is used for XML components, COM call-in and call-out, Java call-in, and for fileload and filedump when a Unicode format is specified.
  • Data in String fields with a W packing code is stored in and retrieved from databases by mapping between UTF-8 and the UTF format used in the database.
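
As an illustration of why mapping between UTF formats can be handled purely by algorithm, the following minimal sketch re-encodes text between UTF-8 and UTF-16 without any lookup table. It is plain Python, not Uniface code; the function names are hypothetical, and the choice of little-endian UTF-16 without a byte order mark is an assumption made only for the example.

```python
# Illustration only: plain Python, not Uniface code.
# Function names and the UTF-16 byte order are assumptions for this sketch.

def utf8_to_utf16(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as UTF-16 little-endian (no byte order mark)."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 little-endian bytes as UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


# Every Unicode character exists in both formats, so the round trip is lossless.
sample = "Größe €5 日本".encode("utf-8")
round_trip = utf16_to_utf8(utf8_to_utf16(sample))
```

Because both formats encode the same set of code points, no character can be lost in either direction.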

Installed Character Set

Characters are encoded in the installed character set when displayed in a character-mode user interface (CHUI), or when passed to a non-Unicode-based component, such as C.

  • In the character-based (CHUI) environment, the contents of String fields are handled by mapping between UTF-8 and the character set specified in $SYS_CHARSET.
  • Uniface communicates with non-Unicode-based components by mapping between UTF-8 and the character set specified in $SYS_CHARSET.
  • Data in String fields with a C packing code is stored in and retrieved from databases by mapping between UTF-8 and the character set specified in $DEF_CHARSET or, if one is specified, the entity's character set.
  • If $META_IN_TRX=0 is specified in the .asn file, data in String fields with a U* packing code is stored in databases in XML format. Such fields can therefore be used to hold Unicode characters. If $META_IN_TRX=1, the data is stored in TRX format.

    Note:  The actual mapping between UTF-8 and other character sets goes through Uniface's meta character set. That is: $SYS_CHARSET/$DEF_CHARSET ↔ meta character set ↔ UTF-8, unless $DEF_CHARSET=UTF-8.
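
The two-step mapping described in the note can be pictured with the following minimal sketch. It is plain Python, not Uniface code: Python's internal string type stands in for the meta character set, which is an assumption made purely for illustration, and Latin-1 is an arbitrary stand-in for the installed character set. The function names are hypothetical.

```python
# Illustration only: plain Python, not Uniface code.
# Python's str plays the role of the intermediate (pivot) representation;
# Uniface's meta character set is its own format and is not modeled here.

def legacy_to_utf8(data: bytes, charset: str) -> bytes:
    """Map bytes in a non-Unicode character set to UTF-8 via the pivot."""
    pivot = data.decode(charset)   # character set -> pivot
    return pivot.encode("utf-8")   # pivot -> UTF-8


def utf8_to_legacy(data: bytes, charset: str) -> bytes:
    """Map UTF-8 bytes to a non-Unicode character set via the pivot.

    Raises UnicodeEncodeError if a character has no equivalent in charset.
    """
    pivot = data.decode("utf-8")   # UTF-8 -> pivot
    return pivot.encode(charset)   # pivot -> character set


latin1_bytes = "café".encode("latin-1")
utf8_bytes = legacy_to_utf8(latin1_bytes, "latin-1")
```

In this sketch, a character that does not exist in the target character set raises an error when mapped from UTF-8. That is Python's behavior only; how Uniface treats such characters is not described here.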

Other Character Sets

A Uniface installation can simultaneously support many character sets. In addition to Unicode and the installed character set, you can:

  • Assign a character set to $DEF_CHARSET that is different from $SYS_CHARSET.
  • Select a different character set for an entity via the Character Set property of the entity interface. Normally, you do not do this unless there is a good reason.
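
The rule for String fields with a C packing code (the entity's character set if one is specified, otherwise $DEF_CHARSET) can be summarized in a short sketch. This is plain Python, not Uniface code; the function name and the character set names in the usage lines are hypothetical placeholders, not valid Uniface setting values.

```python
# Illustration only: plain Python, not Uniface code.
from typing import Optional


def storage_charset(entity_charset: Optional[str], def_charset: str) -> str:
    """Return the character set applied to C-packed String fields.

    entity_charset: the entity's Character Set property, or None if unset.
    def_charset:    the value assigned to $DEF_CHARSET.
    """
    return entity_charset if entity_charset else def_charset


# Hypothetical placeholder names, chosen only to show the precedence.
without_override = storage_charset(None, "CHARSET_A")
with_override = storage_charset("CHARSET_B", "CHARSET_A")
```

Here `without_override` resolves to the $DEF_CHARSET value and `with_override` resolves to the entity's own character set.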
