Sorting Based on Locale
You can sort data using several ProcScript statements and functions. The default sort order is determined by the data type of the data and the values of the NLS settings.
Default Behavior and Overrides
When sorting entities and lists using sort, sort/list, $sortlist, and $sortlistid, the sort order is determined by the data type of the data.
By default, Uniface uses the data type of the specified sort field. Thus, a simple sort instruction (with no sort options) sorts a Numeric field in numerical order and a Date field according to its internal data format.
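The effect of sorting by data type rather than by displayed text can be illustrated outside Uniface. The following Python sketch is not ProcScript and does not model Uniface's internal formats; the sample values and the dd-mm-yyyy display format are assumptions chosen only to show why typed sorting and text sorting give different results.

```python
from datetime import date

# Hypothetical sample data; the dd-mm-yyyy display format is an assumption.
dates = [date(2024, 1, 15), date(2023, 12, 1), date(2024, 3, 2)]
as_text = [d.strftime("%d-%m-%Y") for d in dates]

# Sorting typed date values orders them chronologically.
by_type = sorted(dates)

# Sorting the displayed text compares characters from left to right,
# so the day part dominates and the result is not chronological.
by_text = sorted(as_text)

numbers = [9, 10, 100]

# Sorting numeric values orders them by magnitude.
numeric_order = sorted(numbers)

# Sorting the same values as strings compares them character by character.
text_order = sorted(str(n) for n in numbers)

print(by_type)
print(by_text)
print(numeric_order)
print(text_order)
```

Here the typed sorts return the values in chronological and numerical order, while the text sorts place 01-12-2023 before 02-03-2024 before 15-01-2024, and 10 before 100 before 9.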
For String fields and lists, the sort order is also influenced by the current values of $nlssortorder and $nlslocale. These can be set in the assignment file or in ProcScript.
If $nlslocale is set to anything except classic, the locale-based rules are applied when sorting strings. Sorting of other data types remains the same. The value of $nlslocale can be overridden by $nlssortorder.
It is possible to override the default data type or locale by specifying a value for the Type argument of the ProcScript sort commands. Thus, the sort order is determined by:
- The value of the Type argument. If this argument is specified, it overrides the default data type or NLS setting. Setting Type to classic, CaseSensitive, CaseInsensitive, or nlslocale forces the data to be treated as the String data type, and sorted according to the argument.
- If the Type argument is not specified, the data is sorted according to the data type. Numeric and float data gets sorted numerically, date and time data according to its internal data format.
String data is sorted according to the locale, if specified, or as a binary sort if no locale is defined. The locale for sorting is determined as follows:
- The value of $nlssortorder, if set.
- If $nlssortorder is not set, the value of $nlslocale is used.
- If $nlslocale is set to a specific locale (language and country), locale-based rules are applied during case conversion. If $nlslocale is not set, or is set to classic, a binary sort order is used for strings.
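The order of precedence described above can be summarized as a small decision function. This is a minimal Python sketch of the rules as stated in this topic, not Uniface code: the function name resolve_sort_rule, its parameters, the returned labels, and the sample locale names are all hypothetical, and the sketch does not cover cases this topic leaves unspecified (for example, $nlssortorder itself being set to classic).

```python
from typing import Optional

# Type values that, per the text above, force String treatment.
STRING_TYPES = {"classic", "casesensitive", "caseinsensitive", "nlslocale"}


def resolve_sort_rule(data_type: str,
                      type_arg: Optional[str] = None,
                      nlssortorder: Optional[str] = None,
                      nlslocale: Optional[str] = None) -> str:
    """Return a label naming the rule that drives the sort (illustrative only)."""
    # 1. An explicit Type argument overrides the data type and NLS settings.
    if type_arg:
        if type_arg.lower() in STRING_TYPES:
            return "string:" + type_arg.lower()
        # Assumption: any other Type value names a data type to sort by.
        return "type:" + type_arg.lower()

    # 2. Without a Type argument, non-string data sorts by its data type.
    if data_type.lower() != "string":
        return "type:" + data_type.lower()

    # 3. String data: $nlssortorder first, then $nlslocale, else binary.
    if nlssortorder:
        return "locale:" + nlssortorder
    if nlslocale and nlslocale.lower() != "classic":
        return "locale:" + nlslocale
    return "binary"


# Hypothetical sample values for the NLS settings.
print(resolve_sort_rule("Numeric"))
print(resolve_sort_rule("String", nlslocale="classic"))
print(resolve_sort_rule("String", nlssortorder="de_DE", nlslocale="nl_NL"))
```

The labels are only a way to make the precedence visible; in Uniface the outcome is the sort order itself, not a string.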
Binary vs. Locale-Based Sorting
Binary sorting is often unsatisfactory because it sorts numbers as characters (10 comes before 9), and because it does not take language into account. The combination of language and region, known as locale, can have a significant impact on the way strings are sorted and searched. For example:
- Non-alphabetic languages may be sorted phonetically or by the character appearance.
- Languages that share the same alphabet may sort characters differently. In English, y is sorted between x and z, whereas in Lithuanian, it is sorted between i and k.
- Combinations of letters can be treated as if they were one letter, or one letter can be treated as if it were two letters. For example, in Spanish, ch is treated as a single letter, and sorted between c and d. In German, ä is treated as ae.
- Accented letters can be treated as variants of an unaccented letter, or as completely distinct letters, and therefore sorted differently. For example, in Danish, Å sorts just after Z.
- In French, accents toward the end of the string take precedence over accents toward the beginning of the string. For example, côte sorts before coté, because the accent on the final e is more significant than the accent on the o.
- Unaccented letters that are distinct in one language can be indistinct in another. For example, in English, v and w are different letters; in Swedish, they are variants of the same letter.
- Even in the same language, different applications might require different sorting orders. For example, in German dictionaries, öf comes before of; in phone books it is the exact opposite.
- Sometimes lowercase letters sort before uppercase letters, and sometimes the opposite is true.
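The limits of binary sorting can be seen with a few sample words. The Python sketch below is not ProcScript and does not implement real locale collation; the word list and the helper simple_key are hypothetical, and the helper is only a hand-rolled approximation (fold case, drop accents) to contrast with plain code point order.

```python
import unicodedata

# Hypothetical sample words; "\u00c5" is the letter A with ring above.
words = ["apple", "Zebra", "\u00c5se", "banana", "zulu"]

# Binary (code point) order: uppercase ASCII letters come first,
# then lowercase ASCII letters, then non-ASCII letters.
binary_order = sorted(words)


def simple_key(text: str) -> str:
    """Approximate a language-aware key: strip accents, then fold case.

    This is NOT a locale collation; it treats every accented letter as a
    variant of its base letter, which is wrong for some languages.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


folded_order = sorted(words, key=simple_key)

print(binary_order)
print(folded_order)
```

With binary order, Zebra sorts before apple and the word starting with Å sorts last. The approximate key places that word between apple and banana, which suits a language that treats Å as a variant of A but not Danish, where Å sorts after Z. Differences like this are why a single fixed rule cannot replace locale-based sorting.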