UTF-8, UTF-16, UTF-32 & BOM



Q: Because most supplementary characters are uncommon, does that mean I can ignore them?

A: Most supplementary characters (expressed with surrogate pairs in UTF-16) are relatively uncommon, but that does not mean they can be neglected. Among them are several individual characters that are very popular, as well as many sets important to East Asian procurement specifications. Notable supplementary characters include:

  • many popular emoji and emoticons

  • symbols used for interoperating with Wingdings and Webdings

  • numerous small sets of CJK characters important for procurement, including personal and place names

  • variation selectors used for all ideographic variation sequences

  • numerous minority scripts important for some user communities

  • some highly salient historic scripts, such as Egyptian hieroglyphics
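To make the surrogate-pair point concrete, here is a minimal Python sketch that encodes one sample character from several of the categories above and shows the UTF-16 code units each one needs. The helper names (`utf16_code_units`, `is_supplementary`) and the chosen sample code points are illustrative assumptions, not part of the original FAQ.

```python
def utf16_code_units(ch: str) -> list[int]:
    """Return the UTF-16 code units for a single code point."""
    data = ch.encode("utf-16-be")  # big-endian, no BOM prepended
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_supplementary(ch: str) -> bool:
    """True if the code point lies above the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


# Illustrative samples (assumed choices, one per category)
samples = {
    "emoji (U+1F600)": "\U0001F600",
    "CJK Extension B ideograph (U+20000)": "\U00020000",
    "variation selector VS17 (U+E0100)": "\U000E0100",
    "Egyptian hieroglyph (U+13000)": "\U00013000",
    "BMP letter, for comparison (U+0041)": "A",
}

for label, ch in samples.items():
    units = " ".join(f"{u:04X}" for u in utf16_code_units(ch))
    kind = "surrogate pair" if is_supplementary(ch) else "single code unit"
    print(f"{label}: {units} ({kind})")
```

Code that assumes one 16-bit unit per character would split each of the supplementary samples in half, which is why these characters cannot simply be ignored.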

