What are HTML entities?

HTML entities are text strings that represent characters which have special meaning in HTML or cannot easily be typed. They begin with & and end with ;. For example, `&lt;` represents < (less-than sign), which would otherwise start an HTML tag. Named entities like `&copy;` represent symbols like ©.

What is the difference between named and numeric entities?

Named entities use a human-readable name: `&amp;` for &, `&copy;` for ©. Numeric entities use the Unicode code point: decimal `&#38;` or hexadecimal `&#x26;` for &. Both are equivalent in browsers. Numeric entities can represent any Unicode character even if no named entity exists; named entities are more readable when they exist.
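As a rough illustration of that equivalence, the sketch below decodes the named, decimal, and hexadecimal spellings of the same character with Python's standard `html` module. Python is a choice made for this example only, and `numeric_entity` is a hypothetical helper, not part of any library.

```python
import html
from html.entities import name2codepoint

# Three spellings of the ampersand: named, decimal, hexadecimal.
forms = ["&amp;", "&#38;", "&#x26;"]
decoded = [html.unescape(form) for form in forms]
print(decoded)  # ['&', '&', '&']

# A named entity is a lookup from a name to a Unicode code point.
copy_cp = name2codepoint["copy"]
print(copy_cp, chr(copy_cp))  # 169 ©


def numeric_entity(ch: str, hexadecimal: bool = False) -> str:
    """Build the numeric entity for one character (hypothetical helper)."""
    cp = ord(ch)
    return f"&#x{cp:X};" if hexadecimal else f"&#{cp};"


print(numeric_entity("©"))        # &#169;
print(numeric_entity("©", True))  # &#xA9;
```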

Which characters must always be encoded in HTML?

The five characters that must always be encoded in HTML are: < → `&lt;`, > → `&gt;`, & → `&amp;`, " → `&quot;` (inside attribute values), and ' → `&#39;` (inside single-quoted attributes). In HTML5, the forward slash / does not need encoding but is sometimes encoded for XHTML compatibility.
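A minimal sketch of escaping these five characters, using Python's standard `html.escape` as a stand-in for whatever escaping function your platform provides. Note that Python writes the single quote in its hexadecimal numeric form, `&#x27;`.

```python
import html

untrusted = '<script>alert("x")</script> & \'more\''

# quote=True (the default) also escapes " and ', as needed in attribute values.
escaped = html.escape(untrusted, quote=True)
print(escaped)
# &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; &amp; &#x27;more&#x27;

# quote=False escapes only &, < and >; suitable for text between tags only.
text_only = html.escape(untrusted, quote=False)
print(text_only)
# &lt;script&gt;alert("x")&lt;/script&gt; &amp; 'more'
```

Decoding the result with `html.unescape` returns the original string, so the encoding loses no information.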

Do HTML entities work in XML and XHTML?

In strict XML (and XHTML served as XML), only five predefined entities are available: `&lt;`, `&gt;`, `&amp;`, `&quot;`, and `&apos;`. Other HTML named entities like `&copy;` are not defined in XML and will cause parsing errors unless you declare them in a DOCTYPE. Use numeric entities (`&#169;`) instead of named entities for maximum XML compatibility.
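The difference can be checked with any strict XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; `parses_as_xml` is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(doc: str) -> bool:
    """Return True if the document is well-formed for a strict XML parser."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Named HTML entity that XML does not predefine: rejected.
named_ok = parses_as_xml("<p>&copy; 2026</p>")

# Numeric character reference: accepted.
numeric_ok = parses_as_xml("<p>&#169; 2026</p>")

# The five predefined XML entities: accepted.
predefined_ok = parses_as_xml("<p>&lt;&gt;&amp;&quot;&apos;</p>")

print(named_ok, numeric_ok, predefined_ok)  # False True True
```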

HTML Entities Encoder & Decoder

Encode special characters as HTML entities or decode entities back to plain text. Encoding helps prevent XSS and keeps special characters rendering as text rather than markup.

Input Text

Encode extended characters (non-ASCII) as &#xxx; numeric entities

Common HTML Entities Reference

Character	Entity Name	Numeric	Description
&	`&amp;`	`&#38;`	Ampersand
<	`&lt;`	`&#60;`	Less than
>	`&gt;`	`&#62;`	Greater than
"	`&quot;`	`&#34;`	Double quote
'	`&apos;`	`&#39;`	Single quote
	`&nbsp;`	`&#160;`	Non-breaking space
©	`&copy;`	`&#169;`	Copyright
®	`&reg;`	`&#174;`	Registered trademark
™	`&trade;`	`&#8482;`	Trademark
€	`&euro;`	`&#8364;`	Euro sign

Frequently Asked Questions

HTML encoding is essential for: preventing Cross-Site Scripting (XSS) attacks (user input containing <script> is rendered as literal text instead of executed), displaying literal HTML source code on a page, safely including user content in HTML templates, and displaying characters like <, >, and & as content rather than markup.

HTML entity encoding prevents XSS in HTML contexts, that is, when you insert untrusted data between HTML tags. It does not prevent XSS in every context. For JavaScript contexts (inside a <script> tag), use JSON encoding and also escape < so the data cannot close the script block. For URL attributes (href, src), validate the URL and use URL encoding. For CSS, use CSS-specific encoding. Each injection point needs the encoding that matches its context; HTML encoding alone is not sufficient.
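A minimal sketch of the same untrusted string prepared for three different contexts, using only the Python standard library. The variable names and the `/search` path are made up for illustration, and a real application would normally rely on its template engine's context-aware escaping instead.

```python
import html
import json
from urllib.parse import quote

untrusted = "</script><script>alert(1)</script>"

# HTML body context: entity-encode the markup characters.
html_body = f"<p>{html.escape(untrusted)}</p>"

# JavaScript context: JSON-encode, then escape "<" so a literal "</script>"
# in the data cannot terminate the surrounding script block.
js_literal = json.dumps(untrusted).replace("<", "\\u003c")
script = f"<script>const value = {js_literal};</script>"

# URL component context: percent-encode everything outside the unreserved set.
url = "/search?q=" + quote(untrusted, safe="")

print(html_body)
print(script)
print(url)
```

The JavaScript literal still decodes to the original string, because `\u003c` is a valid JSON and JavaScript escape for `<`.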

HTML encoding converts characters to &entity; format for safe inclusion in HTML markup. URL encoding converts characters to %xx percent-encoded format for safe use in URLs. They serve different contexts and are not interchangeable. For a URL inside an HTML attribute, URL-encode each parameter value first, then HTML-encode the whole attribute value. For example, <a href="search?q=a%26b"> carries the query value a&b: the & is percent-encoded as %26, so the attribute value contains no character left for HTML encoding to change.
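The two steps can be sketched with the Python standard library as follows. The second parameter, `lang`, is an assumption added so that the HTML-encoding step has a visible effect on the `&` separator.

```python
import html
from urllib.parse import urlencode

params = {"q": "a&b", "lang": "en"}

# Step 1: URL-encode the parameter values and join them into a query string.
query = urlencode(params)
url = "search?" + query
print(url)  # search?q=a%26b&lang=en

# Step 2: HTML-encode the whole attribute value before placing it in markup.
href = html.escape(url, quote=True)
link = f'<a href="{href}">Search</a>'
print(link)  # <a href="search?q=a%26b&amp;lang=en">Search</a>
```

The browser reverses the steps in the opposite order: it decodes `&amp;` to `&` when parsing the attribute, and the server decodes `%26` to `&` when parsing the query string.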

PHP's htmlspecialchars() encodes the five critical HTML characters (&, ", ', <, >) and is the recommended function for output escaping in PHP templates. Single quotes are encoded by default only since PHP 8.1; on older versions, pass the ENT_QUOTES flag. htmlentities() additionally encodes all characters that have named HTML entity equivalents. This tool's "encode" mode encodes the same five critical characters; enabling the extended option also encodes non-ASCII characters as numeric entities.
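The two modes can be approximated in a few lines of Python. This is a sketch of the described behavior, not the tool's actual implementation; `encode_entities` and its `extended` parameter are hypothetical names.

```python
import html


def encode_entities(text: str, extended: bool = False) -> str:
    """Escape the five critical characters; optionally escape non-ASCII too."""
    escaped = html.escape(text, quote=True)
    if extended:
        # The "xmlcharrefreplace" error handler turns every character that
        # ASCII cannot represent into a decimal numeric entity.
        escaped = escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
    return escaped


basic = encode_entities("Café <b> ©")
full = encode_entities("Café <b> ©", extended=True)
print(basic)  # Café &lt;b&gt; ©
print(full)   # Caf&#233; &lt;b&gt; &#169;
```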

`&nbsp;` (Non-Breaking Space, U+00A0) is a space character that prevents line breaks between two words. Use it when you want two words to stay on the same line (e.g., "100 km", "Mr. Smith") or to add spacing in HTML where normal spaces collapse. However, avoid overusing it for layout; CSS properties like white-space: nowrap or margin are more appropriate for layout purposes.

Numeric HTML entities can represent any Unicode code point: &#code_point; in decimal or &#xHEX; in hex. For example, the emoji 😀 (U+1F600) is `&#128512;` or `&#x1F600;`. This allows you to include any character in HTML documents that are encoded as ASCII or Latin-1, though for UTF-8 HTML documents (the modern standard) you can usually include Unicode characters directly without entity encoding.
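The conversion from a character to its numeric entities is simple arithmetic on the code point, as this short Python sketch shows. The variable names are chosen for the example.

```python
import html

emoji = "😀"
code_point = ord(emoji)  # 128512, which is 0x1F600

decimal_entity = f"&#{code_point};"
hex_entity = f"&#x{code_point:X};"
print(decimal_entity)  # &#128512;
print(hex_entity)      # &#x1F600;

# Both forms decode back to the same character.
round_trip = [html.unescape(decimal_entity), html.unescape(hex_entity)]
print(round_trip == [emoji, emoji])  # True
```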

About This HTML Entities Encoder / Decoder

This free HTML entities encoder and decoder converts special characters to their HTML entity equivalents and back. All conversion runs in your browser — no data is sent to a server. Essential for XSS prevention and safe HTML output.

When to use this tool

Encoding user-generated content before inserting it into HTML
Decoding HTML entities from API responses or database content
Displaying literal HTML source code on a webpage
Learning which characters need escaping for safe HTML output

In-depth guides and technical articles.

Developer Jul 8, 2026

HTML Encoding Doesn't Stop XSS in JavaScript Context — Why CSP Nonces Are the Missing Second Layer

HTML encoding converts < to `&lt;`, but it can't protect against XSS in JavaScript contexts, where the browser parses script blocks before HTML encoding applies. Content Security Policy blocks injected scripts at the browser level, using nonces (random per-request values) or hashes to allow only legitimate inline scripts. Here's why both layers are required, how CSP nonces work, and what violation reports reveal about attacker probing.

Developer Jun 22, 2026

In UTF-8, Most HTML Entities Are Unnecessary — But These Five Still Are

HTML entities were invented to survive character encoding translation before UTF-8 was universal. In today's UTF-8 world, `&eacute;` and é are identical, but five entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&#39;`) remain essential because they escape characters that have structural meaning in HTML, not encoding meaning. Here's which entities are legacy, which remain necessary, and why `&nbsp;` is a special case that's about rendering behavior rather than encoding.

Developer Jun 17, 2026

Double Encoding: Why "&amp;" Shows Up, and Why the "Quick Fix" Can Be Dangerous

"&amp;" appearing on a webpage instead of "&" is one of the most common HTML-entity bugs — an ampersand encoded twice, because encoding got applied at multiple uncoordinated points in a pipeline. Here's why this happens, why "encode once, at output, as late as possible" is the fix, and why "fixing" double-encoding by removing encoding from the wrong stage can quietly turn a cosmetic bug into an XSS vulnerability.

Developer Jun 14, 2026

Unicode Fundamentals: ASCII History, UTF-8 Encoding, Byte Order Marks, and Why Mojibake Happens

ASCII was designed in 1963 for 7-bit telegraph machines. Every country's attempt to extend it to 8 bits was incompatible, producing mojibake when files crossed systems. Here's how Unicode solved the problem, why UTF-8 became dominant (backward compatibility with ASCII), what byte order marks are, and what character encoding corruption actually looks like.

Developer Jun 9, 2026

XSS and HTML Encoding: The Five Contexts That Require Different Escaping

XSS is still the most common web vulnerability — and unescaped HTML is the mechanism. Here's how cross-site scripting actually works, the five encoding contexts that require different treatment, why React is safe by default but PHP isn't, and how CSP adds a second layer.
