Character Frequency Counter

Analyse any text to see how often each character appears. Filter by letters only or exclude spaces, and sort by frequency or alphabetically.


Frequently Asked Questions

Character frequency analysis has many uses. In cryptanalysis, it helps break simple substitution ciphers by matching the most common ciphertext letters to the most common English letters (E, T, A, O, I). In linguistics, it supports the study of language patterns. In data quality work, it flags unexpected characters in structured text. In compression, frequency data informs Huffman encoding.

In typical English text, the most common letters by frequency are: E (~12.7%), T (~9.1%), A (~8.2%), O (~7.5%), I (~7.0%), N (~6.7%), S (~6.3%), H (~6.1%), R (~6.0%), D (~4.3%). The mnemonic "ETAOIN SHRDLU" lists the twelve most common letters in approximate order of frequency.

Character frequency counts how often each individual character appears. Word frequency counts how often each distinct word appears. Character frequency is useful for cryptanalysis and compression; word frequency is used in NLP, keyword analysis, and text summarisation.
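The difference between the two counts can be sketched in a few lines of Python. This is an illustration only, not this tool's implementation: the function names are hypothetical, and the definition of a "word" (a run of letters or apostrophes, lower-cased) is an assumption made for the example.

```python
from collections import Counter
import re


def char_frequency(text: str) -> Counter:
    """Count every individual character, including spaces and punctuation."""
    return Counter(text)


def word_frequency(text: str) -> Counter:
    """Count distinct words, lower-cased.

    Assumption: a word is a run of ASCII letters or apostrophes.
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


sample = "the cat and the hat"
chars = char_frequency(sample)   # keys are single characters
words = word_frequency(sample)   # keys are whole words
```

The same input yields two very different tables: the character count is dominated by a handful of letters and the space, while the word count is keyed by vocabulary.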

In a Caesar cipher, every letter is shifted by the same constant. The ciphertext frequency pattern is identical to plaintext, just shifted. By finding the most frequent ciphertext letter and assuming it represents E, you can infer the shift. With 50+ characters, this usually works in 1-3 tries.
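A minimal sketch of that attack, assuming English plaintext and ASCII letters only. The helper names are hypothetical, and trying E, then T, then A as the guess for the most frequent letter is one simple way to get the "1-3 tries" described above.

```python
from collections import Counter
import string


def guess_caesar_shifts(ciphertext: str, candidates: str = "ETA") -> list:
    """Return candidate shifts, most likely first.

    Assumes the most frequent ciphertext letter stands for E, then T, then A.
    """
    letters = [c for c in ciphertext.upper() if c in string.ascii_uppercase]
    if not letters:
        return []
    most_common = Counter(letters).most_common(1)[0][0]
    return [(ord(most_common) - ord(p)) % 26 for p in candidates]


def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift on ASCII letters; other characters pass through."""
    out = []
    for c in ciphertext:
        if c.isascii() and c.isalpha():
            base = ord("A") if c.isupper() else ord("a")
            out.append(chr((ord(c) - base - shift) % 26 + base))
        else:
            out.append(c)
    return "".join(out)
```

In practice you would decrypt with each candidate shift in turn and stop at the first result that reads as English. Very short or unusual texts may not have E as their most frequent letter, in which case none of the three guesses will work.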

A frequency histogram is a bar chart where each bar represents one unique value and its length shows how often that value appears. This tool displays horizontal bar charts that make it easy to compare character frequencies at a glance.
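A horizontal histogram of this kind can be approximated in plain text. The sketch below is an assumption about one way to draw it, not a description of how this tool renders its chart: bars are scaled so the most frequent character fills `width` columns, and every character gets at least one mark.

```python
from collections import Counter


def text_histogram(text: str, width: int = 20) -> list:
    """Return one line per unique character, most frequent first."""
    counts = Counter(text)
    if not counts:
        return []
    top = max(counts.values())
    rows = []
    # Sort by descending count, then by character for a stable order.
    for char, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        bar = "#" * max(1, round(n / top * width))
        rows.append(f"{char!r:>4} {bar} {n}")
    return rows


for row in text_histogram("hello world"):
    print(row)
```

Using `repr` for the label keeps spaces and newlines visible in the left-hand column.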

Character frequency varies significantly between languages. German has frequent E, N, I, S, R; French has frequent E, A, S, I, T; Spanish has frequent E, A, O, S, R. Language identification tools exploit these differences. Even within English, the distribution varies between fiction, technical writing, and news articles.

The rarest English letter in typical text is Z (~0.074%), followed by Q (~0.095%) and X (~0.15%). These low-frequency letters informed the high point values in Scrabble: Z = 10 points, Q = 10 points.

Zipf's law is a statistical pattern in which the frequency of an item in a ranked list is roughly inversely proportional to its rank. Applied to words: the second most common word appears about half as often as the first, and the third about one-third as often. The pattern holds approximately for word frequencies in almost every natural language; character frequencies are similarly skewed toward a few common symbols, though they follow the law less closely.
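As a rough illustration, the idealised form of the law (exponent 1, so the expected count at rank r is the top count divided by r) can be compared with observed counts. The function names here are hypothetical, and real text only approximates the ideal curve.

```python
from collections import Counter


def zipf_expected(top_count: float, ranks: int) -> list:
    """Expected counts under an idealised Zipf's law: f(r) = f(1) / r."""
    return [top_count / r for r in range(1, ranks + 1)]


def rank_frequency(items) -> list:
    """Observed counts sorted from most to least frequent."""
    return [n for _, n in Counter(items).most_common()]


observed = rank_frequency("a a a a a a b b b c c d".split())
expected = zipf_expected(observed[0], len(observed))
```

Plotting `observed` against `expected` on log-log axes is the usual way to judge how closely a sample follows the law.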

By default, all characters — including spaces, newlines, and punctuation — are counted. Enable Ignore spaces to exclude all whitespace. Enable Letters only to restrict counting to A-Z / a-z (combined when case-insensitive is on), ignoring digits, punctuation, and whitespace entirely.
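The three options can be sketched as flags on a single counting function. This is a hypothetical reconstruction from the description above, not the tool's actual code; the parameter names and the choice to fold case with `str.lower()` are assumptions.

```python
from collections import Counter


def count_characters(
    text: str,
    case_insensitive: bool = False,
    ignore_spaces: bool = False,
    letters_only: bool = False,
) -> Counter:
    """Count characters, applying the filters described above."""
    if case_insensitive:
        text = text.lower()  # merge A-Z into a-z
    counts = Counter()
    for ch in text:
        if letters_only and not ("a" <= ch <= "z" or "A" <= ch <= "Z"):
            continue  # drop digits, punctuation, and whitespace
        if ignore_spaces and ch.isspace():
            continue  # drop spaces, tabs, and newlines
        counts[ch] += 1
    return counts


def as_percentages(counts: Counter) -> dict:
    """Convert counts to percentages of the total, most frequent first."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: 100 * n / total for ch, n in counts.most_common()}
```

For example, `count_characters("Aa b!", case_insensitive=True, letters_only=True)` keeps only the letters and merges the two cases of A into one entry.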

In raw, unfiltered English text, the space character is almost always the most frequent — appearing roughly once every 5-6 characters. When spaces are excluded, the letter E is most frequent. This is why frequency tools typically offer "ignore spaces" and "letters only" filters.

About This Character Frequency Counter

This free character frequency counter analyses any text and displays how often each character appears, with a proportional bar chart and percentage. It supports case-insensitive counting, letter-only analysis, and multiple sort options.

When to use this tool

Frequency analysis for breaking simple substitution ciphers
Language detection and linguistic comparison studies
Data quality checks on structured text fields
Understanding character distribution for compression algorithms

Standards & References

In-depth guides and technical articles.

Text Jul 3, 2026

How Character Frequency Reveals Who Wrote a Text — Stylometry, Forensic Linguistics, and What It Detects in AI Writing

Stylometry identifies authorship from writing statistics — function words (the, in, upon, of) are largely unconscious style choices that persist across topics, making them better authorship signals than content words. Here's the Federalist Papers attribution case, character n-grams for below-word-level fingerprinting, what character frequency reveals about AI-generated text (more even vocabulary distribution, different punctuation patterns), and how byte frequency identifies text encoding.

Text Jun 19, 2026

How Counting Characters Can Identify a Language — and Why It Gets More Reliable With Every Word

A character-counting algorithm can identify a language from a short text sample because each language has a distinctive character frequency signature: English peaks at E (12.7%), German at E (17.4%) and uses umlauts, and Spanish has a high share of A plus the ñ character. Here's how n-gram comparison against language profiles works, why accuracy improves with text length, where language detection fails (code-switching, similar languages, proper nouns), and applications beyond language detection.

Text Jun 17, 2026

Huffman Coding: How "E Is the Most Common Letter" Becomes Smaller ZIP Files and PNG Images

Every ZIP file, PNG image, and gzip-compressed page relies on the same observation as the previous article's frequency analysis: characters aren't equally frequent, so assigning shorter codes to common symbols and longer codes to rare ones saves space. Here's how Huffman coding builds optimal variable-length codes from frequency data, why "prefix-free" codes need no separators, how this fits into larger compression pipelines like DEFLATE, and why already-compressed data resists further compression.

Text Jun 17, 2026

Frequency Analysis: How Counting Letters Breaks Caesar Ciphers, Substitution Ciphers, and Why Modern Encryption Is Immune

A Caesar cipher can be broken in seconds — not by trying all 25 shifts, but by counting which ciphertext letter appears most often and matching it against English's most common letter, "E." Here's how frequency analysis breaks substitution ciphers, why polyalphabetic ciphers like Vigenère were designed to defeat it, and why modern encryption (AES, RSA) is immune to this entire category of attack.

Text Jun 10, 2026

Character Frequency and NLP Foundations: How Zipf's Law Underlies Search Engines and Language Models

Word frequency analysis underlies search engines, compression algorithms, and how large language models learn. Here's Zipf's Law, TF-IDF for meaningful keyword extraction, how word embeddings come from co-occurrence statistics, and why the character frequency distribution you measure is the same foundation that GPT models learn from.
