---
id: html-charsets
title: "HTML Charsets"
category: "Frontend"
status: "draft"
verification_status: "conceptual"
canonical_id: ""
aliases: ["HTML character sets", "HTML encoding", "character encoding", "UTF-8", "ASCII", "ANSI", "ISO-8859-1", "charset meta"]
duplicate_of: ""
source_trust_level: "B"
confidence_score: 0.88
created_at: 2026-06-23
updated_at: 2026-06-23
review_reason: ""
merge_history: []
tags: ["html", "web", "frontend", "charset", "encoding", "utf-8", "w3schools"]
raw_sources: ["https://www.w3schools.com/html/html_charset.asp"]
applied_in: []
github_commit: ""
---
# [[HTML Charsets]]
## One-line insight
To display an HTML page correctly, a browser must know which **character set (encoding)** the page uses. Modern HTML declares it with `<meta charset="UTF-8">`, and UTF-8, which covers nearly all the world's characters, is the recommended and default choice. [S1]
## Core concepts
- **The charset attribute**: proper display requires the browser to know the page's character encoding, declared via `<meta charset="UTF-8">`. [S1]
- **UTF-8 is recommended**: it covers almost all of the characters and symbols in the world, and is the default character set in HTML5. [S1]
- **Historical progression of encodings**: ASCII → ANSI (Windows-1252) → ISO-8859-1 (HTML 4 default) → UTF-8 (modern default). Each later set is backward-compatible with ASCII at code points 0–127. [S1]
- **Shared low range**: across ASCII, ANSI, ISO-8859-1, and UTF-8, code points 0–127 are identical. [S1]
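
The shared low range can be checked directly. The following is a minimal sketch in Python, assuming the standard codec names `ascii`, `cp1252` (Windows-1252), `latin-1` (ISO-8859-1), and `utf-8`; the helper name `low_range_is_shared` is illustrative only.

```python
# Python codec names: cp1252 is Windows-1252, latin-1 is ISO-8859-1.
ENCODINGS = ("ascii", "cp1252", "latin-1", "utf-8")


def low_range_is_shared() -> bool:
    """Return True if bytes 0-127 decode to the same character in every encoding."""
    for value in range(128):
        raw = bytes([value])
        decoded = {raw.decode(name) for name in ENCODINGS}
        # All four codecs must agree, and the result must be code point `value`.
        if decoded != {chr(value)}:
            return False
    return True


print(low_range_is_shared())
```

Because every byte below 128 means the same thing in all four encodings, plain English text usually survives a wrong charset guess; problems start with characters above 127.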
## Extracted patterns
- **HTML5 charset declaration**: `<meta charset="UTF-8">`. [S1]
- **HTML4 charset declaration**: `<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">`. [S1]
- **Backward-compatibility pattern**: newer encodings keep ASCII (0–127) intact and extend the upper range. [S1]
## Details
**The HTML charset Attribute**
To display an HTML page correctly, a web browser must know the character set used in the page. Developers are encouraged to use UTF-8, which covers almost all of the characters and symbols in the world. The standard declaration is: [S1]
```html
<meta charset="UTF-8">
```
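
Why the declaration matters can be illustrated outside the browser. The sketch below, in Python, simulates a reader that assumes Windows-1252 for bytes that were actually written as UTF-8. The sample string and the helper name `misread` are illustrative assumptions, and real browsers apply more elaborate detection rules than this.

```python
def misread(text: str, actual: str = "utf-8", assumed: str = "cp1252") -> str:
    """Encode `text` with one encoding, then decode the bytes with another."""
    return text.encode(actual).decode(assumed)


sample = "café"
print(misread(sample))                   # wrong assumption: the "é" is garbled
print(misread(sample, assumed="utf-8"))  # right assumption: text round-trips
```

With the wrong assumption, the two UTF-8 bytes of "é" are shown as two separate Windows-1252 characters, which is the typical symptom of a missing or incorrect charset declaration.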
**The ASCII Character Set**
ASCII was the first character encoding standard for the web. It defined 128 different Latin characters, including English letters (a–z, A–Z), numbers (0–9), and special characters such as `! $ + - ( ) @ < > . # ?`. [S1]
**The ANSI Character Set (Windows-1252)**
ANSI (Windows-1252) was the first Windows character set. Its layout: [S1]
- Characters 0–127 match ASCII.
- Characters 128–159 contain special characters.
- Characters 160–255 have the same code points as in UTF-8 (Unicode).
HTML5 declaration: [S1]
```html
<meta charset="Windows-1252">
```
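
One way to inspect the 128–159 range is to decode each byte. The sketch below assumes Python's built-in `cp1252` codec as a stand-in for Windows-1252. That codec leaves a handful of positions unassigned, which other implementations may treat differently.

```python
# Bytes 0x80-0x9F (128-159) in Windows-1252, Python codec name "cp1252".
printable = {}
undefined = []
for value in range(0x80, 0xA0):
    try:
        printable[value] = bytes([value]).decode("cp1252")
    except UnicodeDecodeError:
        # Python's cp1252 codec leaves a few positions unassigned.
        undefined.append(value)

print(printable[0x80])  # euro sign
print(printable[0x99])  # trade mark sign
print(len(printable), "assigned,", len(undefined), "unassigned")
```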
**The ISO-8859-1 Character Set**
ISO-8859-1 was the default character set for HTML 4, supporting 256 characters: [S1]
- Characters 0–127 are identical to ASCII.
- Characters 128–159 are unused.
- Characters 160–255 match ANSI and UTF-8.
HTML 4 syntax: [S1]
```html
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
```
HTML 5 syntax: [S1]
```html
<meta charset="ISO-8859-1">
```
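
The ISO-8859-1 layout can be compared against Windows-1252 byte by byte. This is a sketch assuming Python's `latin-1` and `cp1252` codecs. Python's `latin-1` codec maps bytes 128–159 to invisible control characters, which is one reading of "unused"; the variable names are illustrative.

```python
import unicodedata

# Bytes 160-255 decode to the same characters in ISO-8859-1 and Windows-1252.
upper_matches = all(
    bytes([v]).decode("latin-1") == bytes([v]).decode("cp1252")
    for v in range(160, 256)
)

# Bytes 128-159 decode in Python's latin-1 codec to control characters
# (Unicode category "Cc"), which have no visible glyph.
c1_are_controls = all(
    unicodedata.category(bytes([v]).decode("latin-1")) == "Cc"
    for v in range(128, 160)
)

print(upper_matches, c1_are_controls)
```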
**The UTF-8 Character Set**
UTF-8 character coverage: [S1]
- Values 0–127 match ASCII.
- Values 128–159 are unused.
- Values 160–255 match ANSI and ISO-8859-1.
- From value 256 onward, UTF-8 continues with more than 10,000 additional characters.
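
The values above are character numbers (code points), not byte values. UTF-8 stores code points 0–127 in one byte and higher code points in two or more bytes. A small Python sketch, with an illustrative helper name `utf8_hex` and arbitrarily chosen sample characters:

```python
def utf8_hex(char: str) -> str:
    """Return the UTF-8 bytes of `char` as space-separated hex."""
    return char.encode("utf-8").hex(" ")


for char in ("A", "é", "€"):
    # "A" is below 128, "é" is in 160-255, "€" is beyond 255.
    print(f"U+{ord(char):04X} {char}: {utf8_hex(char)}")

# In ISO-8859-1 the same "é" (value 233) is the single byte 0xE9.
print("é".encode("latin-1").hex())
```

So a page saved as ISO-8859-1 and a page saved as UTF-8 agree on what character number 233 means, but they write it to disk differently, which is why the declared charset has to match the encoding the file was actually saved in.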
**Encoding comparison summary** [S1]
| Character set | Range 0–127 | Range 128–159 | Range 160–255 | Beyond 255 |
|---|---|---|---|---|
| ASCII | Latin characters (128 total) | n/a | n/a | n/a |
| ANSI (Windows-1252) | Same as ASCII | Special characters | Same as UTF-8 | n/a |
| ISO-8859-1 | Same as ASCII | Unused | Same as ANSI/UTF-8 | n/a |
| UTF-8 | Same as ASCII | Unused | Same as ANSI/8859-1 | 10,000+ characters from 256 onward |
The page also references many UTF-8 character-set categories, including Basic Latin, Latin Extended A–E, IPA Extensions, Spacing Modifiers, Diacritical Marks, General Punctuation, Super/Subscript, and Braille. [S1]
## Applied in summary
The canonical applied case is the single `<meta charset="UTF-8">` declaration that should appear in the head of essentially every modern HTML page. No external project or commit applications were found in the source.
## Code patterns
Declare UTF-8 in HTML5 (recommended):
```html
<meta charset="UTF-8">
```
Legacy HTML4 ISO-8859-1 declaration:
```html
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
```
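
Both declaration forms can be read back programmatically, for example when auditing existing pages. The sketch below uses Python's standard `html.parser` module; the class `CharsetFinder` and the function `declared_charset` are hypothetical helpers written for this note, and they do not reproduce the full encoding-detection rules that browsers follow.

```python
from html.parser import HTMLParser
from typing import Optional


class CharsetFinder(HTMLParser):
    """Record the first charset declared by a <meta> tag (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # HTML5 form: <meta charset="UTF-8">
            self.charset = attributes["charset"].strip().lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # HTML 4 form: content="text/html;charset=ISO-8859-1"
            for part in (attributes.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.strip().lower() == "charset" and value.strip():
                    self.charset = value.strip().lower()
                    break


def declared_charset(html: str) -> Optional[str]:
    """Return the declared charset in lowercase, or None if none is found."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


print(declared_charset('<head><meta charset="UTF-8"></head>'))
```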
## Comparison & decision criteria
- **ASCII**: only 128 Latin characters; the original web encoding, insufficient for international text. [S1]
- **ANSI (Windows-1252)**: the first Windows set; adds special characters in 128–159, but Windows-specific. [S1]
- **ISO-8859-1**: the HTML 4 default; 256 characters, still limited for global content. [S1]
- **UTF-8**: recommended choice and the HTML5 default; covers almost all of the world's characters and is backward-compatible with ASCII at 0–127. Use UTF-8 unless a legacy constraint forces otherwise. [S1]
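
The practical limits of each set show up when trying to encode real text. The following sketch, assuming Python's standard codec names and an illustrative helper `encodings_that_fit`, lists which of the four encodings can represent a given string:

```python
CANDIDATES = ("ascii", "latin-1", "cp1252", "utf-8")


def encodings_that_fit(text: str) -> list:
    """Return the candidate encodings able to represent every character in `text`."""
    fits = []
    for name in CANDIDATES:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no representation in this encoding.
            continue
        fits.append(name)
    return fits


for sample in ("Hello", "café", "€5", "日本語"):
    print(sample, encodings_that_fit(sample))
```

Only UTF-8 appears for every sample, which is consistent with the recommendation to use it by default.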
## Contradictions & updates
No contradictions found in the source. The historical default shifted from ISO-8859-1 (HTML 4) to UTF-8 (HTML5), reflecting the move toward universal Unicode support. [S1]
## Verification status and confidence
- **Status:** draft
- **Verification stage:** conceptual (can be promoted to applied/validated once a real-world application is found)
- **Source trust level:** B (W3Schools: widely used educational reference, not a primary standards body)
- **Confidence score:** 0.88
- **Duplicate check result:** Newly created (New discovery)
## Knowledge Graph
- **Parent/root:** [[HTML Tutorial]]
- **Related concepts:** [[HTML Emojis]], [[HTML Symbols]], [[HTML URL Encode]], [[HTML Head]]
- **Reference context:** Referenced whenever defining how a page's text is encoded, especially for international characters and emojis.
## Sources
- [S1] W3Schools, "HTML Charsets", https://www.w3schools.com/html/html_charset.asp
## Change history
- 2026-06-23: Initial draft synthesized from the W3Schools "HTML Charsets" page (Astra wiki-curation, P-Reinforce v3.1 format).