Question 1

What is Unicode and why was it created?

Accepted Answer

Unicode is a character encoding standard that aims to assign a unique number (a code point) to every character in every writing system. As of version 15.0 it defines 149,186 characters across 161 scripts, including emoji and symbols, in addition to control codes. Before Unicode, dozens of incompatible encodings (ASCII, Latin-1, Shift-JIS, GB2312) were in use, many of which assigned different characters to the same byte values, so text exchanged between systems and languages was often corrupted or unreadable. Unicode, now in version 15+, is the standard character set for virtually all modern text.
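As a rough illustration of both points, the Python sketch below prints the code point of one character and then decodes its UTF-8 bytes with the wrong legacy encoding. The sample character and variable names are chosen for this example only.

```python
# One character, written as an escape so the source file's own encoding does not matter.
e_acute = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The code point is the number Unicode assigns to the character.
code_point = ord(e_acute)
print(f"U+{code_point:04X}")

# Encode as UTF-8, then decode the same bytes as Latin-1 (the wrong encoding).
utf8_bytes = e_acute.encode("utf-8")
mojibake = utf8_bytes.decode("latin-1")
print(repr(e_acute), "->", repr(mojibake))

# Going the other way fails outright: the Latin-1 byte is not valid UTF-8.
latin1_bytes = e_acute.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print("Latin-1 bytes decodable as UTF-8:", decoded_ok)
```

The garbled two-character result is the kind of corruption that appears when sender and receiver disagree about the encoding.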

Question 2

What is the difference between Unicode and UTF-8?

Accepted Answer

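Unicode is the character set: the assignment of numbers (code points) to characters. UTF-8 is one encoding of those numbers into bytes. UTF-8 uses 1-4 bytes per code point: ASCII characters (U+0000-U+007F) use 1 byte; U+0080-U+07FF (accented Latin, Greek, Cyrillic, Hebrew, Arabic) use 2; the rest of the Basic Multilingual Plane, U+0800-U+FFFF (most CJK characters and many symbols, such as the euro sign), uses 3; U+10000-U+10FFFF (most emoji and rare characters) use 4. UTF-8 is backward-compatible with ASCII, meaning any valid ASCII file is also valid UTF-8, and it is the dominant encoding on the web.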
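The byte lengths can be checked directly. The following Python sketch encodes one sample code point from each length class; the sample characters and the helper name `utf8_length` are picked for illustration.

```python
# One sample per UTF-8 length class, written as escapes to avoid ambiguity.
samples = [
    "A",           # U+0041, ASCII
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\u4e2d",      # U+4E2D, a common CJK ideograph
    "\U0001F600",  # U+1F600, grinning face emoji
]


def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs for the given string."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), hex {encoded.hex(' ')}")
```

Note that `len()` on a Python string counts code points, while `len()` on the encoded bytes counts bytes, so the two can differ for the same text.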

Question 3

What are zero-width characters and why do they cause bugs?

Accepted Answer

Zero-width characters are Unicode code points that occupy no visual space: U+200B (Zero Width Space), U+200C (Zero Width Non-Joiner), U+200D (Zero Width Joiner), and U+FEFF (Zero Width No-Break Space, used as the byte order mark, BOM). They are invisible in most editors and UIs. They cause comparison bugs because two strings can look identical yet differ: 'hello' and 'hel' + U+200B + 'lo' render the same but are not string-equal, so equality checks and lookups fail with no visible cause. They commonly appear in text copied from PDFs, web pages, and word processors.
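A minimal Python sketch of how such characters might be detected and removed is shown below. It covers only the four code points listed above, and the names `ZERO_WIDTH`, `find_zero_width`, and `strip_zero_width` are hypothetical helpers invented for this example.

```python
# The four code points named in the answer above; not an exhaustive list of invisible characters.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def find_zero_width(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every zero-width character in text."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def strip_zero_width(text: str) -> str:
    """Return text with the listed zero-width characters removed."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


clean = "hello"
dirty = "hel\u200blo"  # looks like "hello" when rendered

print(clean == dirty)                    # the strings differ
print(len(clean), len(dirty))            # the lengths differ too
print(find_zero_width(dirty))            # where the hidden character sits
print(strip_zero_width(dirty) == clean)  # equal again after stripping
```

Blind stripping is not always safe: U+200D joins emoji sequences and U+200C affects rendering in scripts such as Persian, so it may be better to report the characters than to delete them.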

Question 4

How do I type or insert Unicode characters I can't find on my keyboard?

Accepted Answer

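On Windows: hold Alt and type the decimal code on the numpad (Alt+0169 for the copyright sign); the leading zero selects the Windows code page (Windows-1252 on Western systems), which matches Unicode for values 160-255. Alternatively, press Win+. for the emoji picker, or type the hex code in Word and press Alt+X. On macOS: open the Character Viewer with Control+Command+Space, or enable it via System Preferences > Keyboard > Show Emoji & Symbols. In code, use escape sequences: \u00A9 in JavaScript/Python, &#169; or &copy; in HTML. The \uXXXX form takes exactly four hex digits; for code points above U+FFFF, use \u{1F600} in JavaScript or \U0001F600 in Python.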
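The escape forms can be compared in Python, as in the sketch below. It uses only the standard library (`html` and `unicodedata`); the variable names are illustrative.

```python
import html
import unicodedata

# Four ways to write the copyright sign (U+00A9) without typing it.
from_hex_escape = "\u00A9"             # four-hex-digit escape
from_code_point = chr(169)             # decimal code point, as in Alt+0169
from_name = "\N{COPYRIGHT SIGN}"       # official Unicode character name
from_html = html.unescape("&#169;")    # numeric HTML character reference
from_entity = html.unescape("&copy;")  # named HTML entity

all_equal = from_hex_escape == from_code_point == from_name == from_html == from_entity
print(all_equal)
print(unicodedata.name(from_hex_escape))

# Code points above U+FFFF need the eight-digit \U form in Python.
grinning = "\U0001F600"
print(len(grinning), f"U+{ord(grinning):X}")
```

The `\N{...}` form is often the most readable choice in source code, since the character name documents itself.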

Unicode Explorer
