UTF-8, UTF-16, UTF-32 & BOM

http://unicode.org/faq/utf_bom.html

Q: Because most supplementary characters are uncommon, does that mean I can ignore them?

A: Most supplementary characters (expressed with surrogate pairs in UTF-16) are not too common. However, that does not mean that supplementary characters should be neglected. Among them are a number of individual characters that are very popular, as well as many sets important to East Asian procurement specifications. Among the notable supplementary characters are:

  • many popular emoji and emoticons

  • symbols used for interoperating with Wingdings and Webdings

  • numerous small sets of CJK characters important for procurement, including personal and place names

  • variation selectors used for all ideographic variation sequences

  • numerous minority scripts important for some user communities

  • some highly salient historic scripts, such as Egyptian hieroglyphics
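
To make "expressed with surrogate pairs in UTF-16" concrete, here is a minimal sketch of how a supplementary code point (U+10000 through U+10FFFF) maps to two 16-bit code units. The function names `is_supplementary` and `to_surrogate_pair` are illustrative and do not come from the FAQ; the arithmetic follows the standard UTF-16 encoding form.

```python
from typing import Tuple


def is_supplementary(code_point: int) -> bool:
    """Return True if the code point lies outside the Basic Multilingual Plane."""
    return 0x10000 <= code_point <= 0x10FFFF


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates.

    Hypothetical helper for illustration; real code would normally rely on
    the platform's UTF-16 encoder instead.
    """
    if not is_supplementary(code_point):
        raise ValueError("not a supplementary code point: U+%04X" % code_point)
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


if __name__ == "__main__":
    # U+1F600 is an emoji; U+13000 is an Egyptian hieroglyph.
    for cp in (0x1F600, 0x13000):
        high, low = to_surrogate_pair(cp)
        # Cross-check against Python's built-in UTF-16 encoder.
        encoded = chr(cp).encode("utf-16-be")
        assert encoded == high.to_bytes(2, "big") + low.to_bytes(2, "big")
        print("U+%04X -> %04X %04X" % (cp, high, low))
```

Code that treats each 16-bit unit as a whole character (for example when truncating, reversing, or counting) would split such a pair, which is one practical reason these characters cannot be ignored.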