ISO/IEC 14651:2025

Information technology - International string ordering and comparison - Method for comparing character strings and description of the common template tailorable ordering

OVERVIEW

This document defines a reference comparison method. The method is applicable to two or more character strings to determine their collating order in a sorted list. It can be applied to strings containing characters from the full repertoire of ISO/IEC 10646, and also to subsets of that repertoire, to produce ordering results that are valid (after tailoring) for a given set of languages for each script. The method uses collation tables derived either from the Common Template Tables (CTT) referenced by this document or from one of their tailorings. The format of a Common Template Table is described in Backus-Naur Form (BNF), and this format is used normatively within this document.
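As an informal illustration of table-driven comparison, the following Python sketch derives a sort key from a small collation table and uses it to order a list. It assumes a three-level weight scheme (base letter, accent, case) of the kind commonly used in table-driven collation. The table TOY_TABLE, its weights, and the function names are invented for this example; they are not taken from the Common Template Tables and do not follow the BNF format defined by this document.

```python
# Toy collation table: character -> (primary, secondary, tertiary) weight.
# All weights are hypothetical and chosen only for illustration.
TOY_TABLE = {
    "a": (1, 1, 1),
    "A": (1, 1, 2),       # same letter as "a", differs only in case
    "\u00e1": (1, 2, 1),  # a with acute: same letter, differs in accent
    "b": (2, 1, 1),
    "B": (2, 1, 2),
}


def sort_key(s):
    """Build a key that lists all primary weights, then all secondary
    weights, then all tertiary weights, so that a lower level is only
    consulted when every higher level is equal."""
    weights = [TOY_TABLE[ch] for ch in s]
    primary = tuple(w[0] for w in weights)
    secondary = tuple(w[1] for w in weights)
    tertiary = tuple(w[2] for w in weights)
    return (primary, secondary, tertiary)


def compare(a, b):
    """Return -1, 0 or 1 according to the order of the two keys."""
    ka, kb = sort_key(a), sort_key(b)
    return (ka > kb) - (ka < kb)


words = ["b", "Ab", "ab", "\u00e1b", "a"]
sorted_words = sorted(words, key=sort_key)
print(sorted_words)
```

With this toy table, strings that differ in a base letter are separated first; accent and case differences only break ties between strings whose base letters are identical.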

This document also defines syntax elements to tailor these Common Template Tables used by the reference comparison method. Furthermore, it defines requirements for a declaration of the differences (delta) between a collation table and a given Common Template Table including the tailoring elements.

These Common Template Tables describe an order for all characters encoded in the current and past ISO/IEC 10646 editions, including amendments. They allow for a specification of a fully deterministic ordering. These tables enable the specification of a string ordering adapted to local ordering rules, without requiring an implementer to have knowledge of all the different scripts already encoded in the Universal Coded Character Set (UCS).

All of these Common Template Tables have reference names that relate to a particular stage of development of the ISO/IEC 10646 Universal Coded Character Set or to a particular version of the Unicode Standard. These names and their relationship with the ISO/IEC 10646 or Unicode Standard repertoire are specified by an externally referenced document: Unicode Technical Standard, UTS #10, Unicode Collation Algorithm.

This document does not:

mandate a specific comparison method; any equivalent method giving the same results is acceptable;

mandate a specific format for describing or tailoring tables in a given implementation;

mandate specific symbols to be used by implementations;

mandate any specific internal format for the intermediate keys used when comparing, or for the table used; the use of numeric keys is not mandated either;

mandate a context-dependent ordering;

mandate any particular preparation of character strings prior to comparison.

NOTE 1 Character strings are typically prepared prior to comparison, even though this document does not prescribe such preparation (see Annex C).

NOTE 2 Annex D describes the problems that gave rise to this document, together with their anticipated solutions.
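As an informal illustration of the kind of preparation mentioned in NOTE 1, the following Python sketch normalizes strings before they are compared. Unicode normalization is assumed here only as one common example of preparation; this overview does not state which preparation steps Annex C covers, and the function name prepare is hypothetical.

```python
import unicodedata


def prepare(s):
    """Hypothetical preparation step: canonical decomposition (NFD),
    so that precomposed and decomposed spellings of the same text
    become identical code point sequences."""
    return unicodedata.normalize("NFD", s)


precomposed = "\u00e9"   # e with acute as a single code point
decomposed = "e\u0301"   # e followed by a combining acute accent

# The raw strings differ, but the prepared strings are identical,
# so a comparison applied after preparation treats them as equal.
print(precomposed == decomposed)
print(prepare(precomposed) == prepare(decomposed))
```

Without such a step, two strings that a reader would consider the same text can be fed to the comparison as different code point sequences.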

PRODUCT DETAILS

Status: Current
Edition: 2025
No. of Pages: 51
ICS Classification: 35.040.10 Coding of character sets
Committee: ISO/IEC JTC 1/SC 2
Available for Purchase: For sale in Singapore only
Adoption: ISO : 0