UTF-8, UTF-16, UTF-32 & BOM



Q: Because most supplementary characters are uncommon, does that mean I can ignore them?

A: No. Most supplementary characters (expressed with surrogate pairs in UTF-16) are relatively uncommon, but that does not mean they can be neglected. Among them are a number of individual characters that are very popular, as well as many sets important to East Asian procurement specifications. Notable supplementary characters include:

  • many popular emoji and emoticons

  • symbols used for interoperating with Wingdings and Webdings

  • numerous small sets of CJK characters important for procurement, including personal and place names

  • variation selectors used for all ideographic variation sequences

  • numerous minority scripts important for some user communities

  • some highly salient historic scripts, such as Egyptian hieroglyphs
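
To make the point concrete, here is a minimal Python sketch of how a supplementary code point maps to a UTF-16 surrogate pair, and why code that counts one 16-bit unit per character miscounts such text. The helper names to_surrogate_pair and utf16_length are illustrative assumptions, not part of any standard API; the sample characters are one emoji (U+1F600), one Egyptian hieroglyph (U+13000), and one variation selector (U+E0100).

```python
from typing import Tuple


def to_surrogate_pair(cp: int) -> Tuple[int, int]:
    """Return the (high, low) UTF-16 surrogates for a supplementary code point.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point: U+%04X" % cp)
    offset = cp - 0x10000              # 20-bit value to split across two units
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the low surrogate
    return high, low


def utf16_length(text: str) -> int:
    """Count the UTF-16 code units needed to store text (no BOM)."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    for cp in (0x1F600, 0x13000, 0xE0100):
        high, low = to_surrogate_pair(cp)
        print("U+%05X -> %04X %04X" % (cp, high, low))

    sample = "a\U0001F600"
    # One BMP character plus one supplementary character:
    # two code points, but three UTF-16 code units.
    print(len(sample), utf16_length(sample))
```

Software that assumes every character fits in a single 16-bit unit would split or truncate the characters listed above, which is the practical reason they cannot be ignored.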


