Character sets, or charsets, determine how the bytes of a webpage are translated into the characters a browser displays. In HTML, declaring the correct charset ensures that letters, symbols, and emojis render correctly across browsers and devices. Declaring the wrong charset may result in garbled text, question marks, or missing characters. In this chapter, you will learn what HTML charsets are, why they are important, how to specify them in your webpages, and which charsets are most commonly used.
A charset defines the mapping between bytes and the characters they represent. Computers store text as numbers, and the charset tells the browser which numbers correspond to which characters. Without a proper charset, the browser cannot correctly interpret text from the HTML file.
If a file saved as UTF-8 contains the character é (stored as the two bytes 0xC3 0xA9):
UTF-8 interprets both bytes together, correctly, as é.
ISO-8859-1 treats each byte as a separate character and displays Ã©.
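The mismatch described above can be reproduced outside the browser. The following is a minimal Python sketch, not part of the original chapter, that encodes é as UTF-8 and then decodes the same bytes with two different charsets:

```python
# The character é as it is stored in a UTF-8 file: two bytes.
raw = "é".encode("utf-8")
print(raw)        # b'\xc3\xa9'

# Decoding with the matching charset recovers the original character.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 maps each byte to its own character.
as_latin1 = raw.decode("iso-8859-1")

print(as_utf8)    # é
print(as_latin1)  # Ã©
```

The bytes never change; only the charset used to read them does.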
Charsets ensure text displays consistently for all users, regardless of:
Operating system (Windows, macOS, Linux)
Browser (Chrome, Firefox, Edge, Safari)
Device (desktop, tablet, mobile)
Without a declared charset, the browser has to guess the encoding, and if it guesses wrong:
Non-English letters may display incorrectly.
Symbols and emojis may break.
Multi-language content may be corrupted.
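The charset is declared in the <head> section using the <meta> tag. Place it as early as possible in <head>, because browsers look for it only within the first 1024 bytes of the document.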
<meta charset="UTF-8">
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Charset Example</title>
</head>
<body>
<p>Accented letters: é, ñ, ü</p>
<p>Emoji: 😀 🚀 ❤️</p>
</body>
</html>
Here, UTF-8 ensures all accented letters, emojis, and symbols render correctly.
UTF-8 is the most widely used charset on the web
Supports all Unicode characters
Recommended for modern websites
<meta charset="UTF-8">
ISO-8859-1 supports Western European languages
Limited symbol support
Mostly used in legacy websites
<meta charset="ISO-8859-1">
Windows-1252 is Microsoft's variant of ISO-8859-1
Includes extra printable characters (such as €, curly quotes, and ™) in the byte range 0x80 to 0x9F
Used in older Windows applications
<meta charset="Windows-1252">
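To see where Windows-1252 and ISO-8859-1 actually differ, here is a small Python sketch. The sample bytes are chosen for illustration only; Python's `cp1252` codec is its name for Windows-1252.

```python
# Bytes 0x80-0x9F are the range where the two charsets disagree.
sample = bytes([0x80, 0x93, 0x94, 0x99])

as_cp1252 = sample.decode("cp1252")      # printable characters
as_latin1 = sample.decode("iso-8859-1")  # invisible C1 control characters

print(as_cp1252)                # €“”™
print(as_latin1.isprintable())  # False

# Outside that range the two charsets agree, for example on é (0xE9).
same_e = bytes([0xE9]).decode("cp1252") == bytes([0xE9]).decode("iso-8859-1")
print(same_e)                   # True
```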
UTF-16 encodes Unicode in 16-bit units, using one or two units per character
Supports all Unicode characters
Rarely used in HTML; usually for internal processing
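One way to compare UTF-8 and UTF-16 is to count the bytes each needs for the same character. The helper name `encoded_sizes` below is made up for this sketch, and `utf-16-le` is used so that no byte order mark is included in the count:

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a byte order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

for sample in ["A", "é", "你", "😀"]:
    print(sample, encoded_sizes(sample))
# A (1, 2)
# é (2, 2)
# 你 (3, 2)
# 😀 (4, 4)
```

Plain ASCII text, which makes up most HTML markup, takes twice the space in UTF-16, and the emoji needs two 16-bit units.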
UTF-8 is essential for multilingual content. For example:
<p>English: Hello</p>
<p>Hindi: नमस्ते</p>
<p>Arabic: مرحبا</p>
<p>Chinese: 你好</p>
Without UTF-8, Hindi, Arabic, and Chinese characters may appear as garbled text.
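The limitation of a legacy charset can be checked directly. This is a rough Python sketch using a hypothetical helper, `can_encode`, that reports whether a charset has any byte representation for a piece of text:

```python
greetings = {
    "English": "Hello",
    "Hindi": "नमस्ते",
    "Arabic": "مرحبا",
    "Chinese": "你好",
}

def can_encode(text, charset):
    """Return True if every character in text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

for language, word in greetings.items():
    # Prints: language, fits in UTF-8?, fits in ISO-8859-1?
    print(language, can_encode(word, "utf-8"), can_encode(word, "iso-8859-1"))
```

Only the English greeting fits in ISO-8859-1, while all four fit in UTF-8.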
Form submissions must also consider charset. By default, a form is submitted in the same encoding as the page that contains it. The accept-charset attribute states explicitly which charset to use when sending form data.
<form action="/submit" method="post" accept-charset="UTF-8">
<input type="text" name="name" placeholder="Enter your name">
<input type="submit" value="Submit">
</form>
This ensures that any special characters typed by the user are transmitted correctly.
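To illustrate what "transmitted correctly" means, the sketch below uses Python's standard `urllib.parse` module to imitate how a form body is percent-encoded and then decoded on the server. The field value is invented for the example, and a real server framework may handle the decoding step for you.

```python
from urllib.parse import urlencode, parse_qs

# What the user typed into the "name" field (example value).
form_data = {"name": "José 😀"}

# The browser percent-encodes the UTF-8 bytes of each value.
body = urlencode(form_data, encoding="utf-8")
print(body)  # name=Jos%C3%A9+%F0%9F%98%80

# A server that decodes with the same charset gets the original text back.
decoded = parse_qs(body, encoding="utf-8")

# A server that assumes a different charset gets garbled text instead.
wrong = parse_qs(body, encoding="iso-8859-1")

print(decoded["name"][0])  # José 😀
```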
If the accented letters é ñ ü appear on a webpage like this:
Ã© Ã± Ã¼
it means the declared charset does not match the actual encoding of the file. Always save your HTML file in the same encoding as declared in the <meta charset> tag.
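Garbled text of the form Ã© Ã± Ã¼ is typical of a UTF-8 file that was read as Windows-1252 or ISO-8859-1. As a diagnostic sketch, the hypothetical helper `repair_mojibake` below reverses that one specific mistake. It is not a general repair tool and raises an error on text that was not garbled this way.

```python
def repair_mojibake(garbled, wrong_charset="cp1252"):
    """Undo a UTF-8 text having been decoded with wrong_charset."""
    # Step 1: turn the wrong characters back into the original bytes.
    original_bytes = garbled.encode(wrong_charset)
    # Step 2: decode those bytes with the charset they were written in.
    return original_bytes.decode("utf-8")

print(repair_mojibake("Ã© Ã± Ã¼"))  # é ñ ü
```

If the repaired text looks right, the file is UTF-8 and the declaration, not the file, is what needs fixing.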
Most modern editors allow you to save files in UTF-8:
VS Code → click the encoding shown in the status bar → Save with Encoding → UTF-8
Sublime Text → File → Save with Encoding → UTF-8
Notepad → Save As → Encoding → UTF-8
Always verify that the encoding matches your <meta> declaration.
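One way to verify the saved encoding without opening an editor is to check whether the file's bytes decode cleanly as UTF-8. The sketch below uses a hypothetical helper, `is_valid_utf8`, and writes two temporary files to show both outcomes. A pure-ASCII file passes the check under either charset, so the test is only meaningful when the page contains non-ASCII characters.

```python
import os
import tempfile

def is_valid_utf8(path):
    """Return True if the file's raw bytes form valid UTF-8."""
    with open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

page = '<!DOCTYPE html><html><head><meta charset="UTF-8"></head><body><p>é ñ ü</p></body></html>'

with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "good.html")
    bad = os.path.join(folder, "bad.html")
    # Saved in the encoding the <meta> tag declares.
    with open(good, "w", encoding="utf-8") as handle:
        handle.write(page)
    # Saved in a different encoding than the <meta> tag declares.
    with open(bad, "w", encoding="iso-8859-1") as handle:
        handle.write(page)
    good_ok = is_valid_utf8(good)
    bad_ok = is_valid_utf8(bad)

print(good_ok)  # True
print(bad_ok)   # False
```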
Before HTML5, charsets were declared like this:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
In HTML5, this is simplified to:
<meta charset="UTF-8">
The new syntax is preferred because it is shorter and harder to mistype. Browsers treat both forms the same way.
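Both declarations carry the same information, which a short script can confirm. The sketch below uses Python's standard `html.parser` module. The class `CharsetFinder` and the function `declared_charset` are names invented for this example, and the logic is far simpler than the detection a real browser performs.

```python
from html.parser import HTMLParser

class CharsetFinder(HTMLParser):
    """Record the charset from the first <meta> tag that declares one."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # HTML5 form: <meta charset="UTF-8">
            self.charset = attributes["charset"].strip().lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Older form: content="text/html; charset=UTF-8"
            content = attributes.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().lower()

def declared_charset(html):
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset

old_style = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
new_style = '<meta charset="UTF-8">'
print(declared_charset(old_style))  # utf-8
print(declared_charset(new_style))  # utf-8
```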
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>UTF-8 Example</title>
</head>
<body>
<h1>UTF-8 Charset Demo</h1>
<p>English: Hello</p>
<p>French: Bonjour</p>
<p>German: Grüß Gott</p>
<p>Spanish: ¡Hola!</p>
<p>Hindi: नमस्ते</p>
<p>Emoji: 😀 🚀 ❤️</p>
</body>
</html>
This example ensures that all languages and emojis render correctly in browsers.
HTML charsets define how text is interpreted and displayed in web pages. UTF-8 is the modern standard, supporting all characters, symbols, and emojis. Older charsets like ISO-8859-1 or Windows-1252 are limited and mainly used for legacy content. Correctly setting the charset in <meta> and saving your HTML files with the matching encoding ensures that your website displays multilingual text, symbols, and emojis accurately across all browsers and devices. Always use UTF-8 for new projects to avoid issues with text corruption.
Q1. Write the correct HTML <meta> tag to set charset to UTF-8.
Q2. Explain why UTF-8 is preferred over ASCII for webpages.
Q3. Show how to declare ISO-8859-1 charset in HTML.
Q4. Describe what happens if no charset is declared.
Q5. Write a webpage that displays characters from multiple languages correctly.
Q6. Explain the difference between UTF-8 and UTF-16.
Q7. Demonstrate how to set charset in HTTP server headers (example for Apache or Nginx).
Q8. Identify charset problems from given garbled text.
Q9. Explain why charset declaration should be as early as possible in HTML.
Q10. Create a sample HTML page with emoji and special symbols and correct charset.