Unicode Case Conversion Challenges: The Turkish I Problem, German ß, and Locale-Aware APIs
JavaScript's toUpperCase() returns the wrong character for Turkish 'i', and uppercasing German ß changes a string's length: "Straße" goes from 6 to 7 characters. Here are the Unicode case conversion edge cases that cause real bugs, the locale-aware API alternatives in JavaScript and Python, and the reasons title case rules differ across languages.
By sadiqbd · June 10, 2026
toUpperCase() is wrong for Turkish — and this reveals why text case conversion is more complex than it looks
Most developers assume case conversion is a simple problem: lowercase a becomes uppercase A, and every programming language handles this correctly. This assumption is wrong for several languages and causes real bugs in production software that handles international text.
The canonical example: the Turkish dotted/dotless I problem. Turkish has four i-related characters: dotted lowercase i, dotless lowercase ı, dotted uppercase İ, and dotless uppercase I. In Turkish, lowercase i uppercases to İ (dotted capital), not I (dotless capital). JavaScript's 'i'.toUpperCase() ignores locale and always returns 'I', which is wrong for Turkish text, while 'i'.toLocaleUpperCase('tr') returns 'İ'. In environments where the default conversion follows the system locale, such as Java's no-argument String.toUpperCase(), code that was expected to return 'I' returns 'İ' on a Turkish-locale machine.
The Turkish I problem in practice
The characters:
- i (U+0069): Latin small letter I (the dot above is implicit in Latin script)
- ı (U+0131): Latin small letter dotless I
- I (U+0049): Latin capital letter I (dotless)
- İ (U+0130): Latin capital letter I with dot above
In Turkish:
- i → İ (uppercase of dotted i is dotted I)
- ı → I (uppercase of dotless i is dotless I)
In most other Latin languages:
- i → I (standard uppercase)
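The two sets of mappings above can be sketched in Python, whose built-in str.upper() and str.lower() always apply the non-Turkish rules. This is a minimal illustration rather than a full locale implementation: turkish_upper and turkish_lower are hypothetical helper names, and they only remap the four I characters before delegating to the default conversion.

```python
# Hypothetical helpers: Python's str.upper()/str.lower() are not locale-aware,
# so the Turkish-specific pairs are remapped first.
_TR_UPPER = str.maketrans({"i": "İ", "ı": "I"})
_TR_LOWER = str.maketrans({"İ": "i", "I": "ı"})


def turkish_upper(text: str) -> str:
    """Uppercase text using the Turkish i/İ and ı/I pairs."""
    return text.translate(_TR_UPPER).upper()


def turkish_lower(text: str) -> str:
    """Lowercase text using the Turkish İ/i and I/ı pairs."""
    return text.translate(_TR_LOWER).lower()


print(turkish_upper("istanbul"))  # İSTANBUL
print("istanbul".upper())         # ISTANBUL (default, non-Turkish rules)
print(turkish_lower("ISIK"))      # ısık
```

Remapping before the default conversion matters: the default lowercase of İ is not a plain i, so the Turkish pairs have to be handled first.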
The bug this causes:
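// toUpperCase() ignores locale; toLocaleUpperCase('tr') applies Turkish rules
'i'.toUpperCase() === 'I' // true in every locale
'i'.toLocaleUpperCase('tr') === 'İ' // true: dotted capital, not 'I'
// Comparison that breaks under Turkish rules
'FILE'.toLowerCase() === 'file' // true: locale-independent
'FILE'.toLocaleLowerCase('tr') === 'file' // false: the result is 'fıle' (dotless ı)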
Real-world examples of this bug:
- Email address case-insensitive comparison: matching addresses such as user@example.com vs USER@EXAMPLE.COM fails under Turkish case rules when the address contains 'i'
- URL normalisation: a URL containing 'i' normalised to uppercase under Turkish rules produces a different string
- Database case-insensitive queries: the SQL UPPER() function behaves differently on databases with Turkish collation
German eszett (ß) and uppercase
German has a character — the eszett or sharp S (ß) — that has no uppercase equivalent in traditional German orthography. It was historically uppercased as "SS":
Straße (street) → STRASSE
In 2017, the Council for German Orthography (Rat für deutsche Rechtschreibung) admitted a capital ß (ẞ, U+1E9E, encoded in Unicode since 2008) into the official orthography as an uppercase form alongside SS. However:
- Not all systems support it
- Traditional style guides still use SS in uppercase
- The capital ẞ is not consistently available in fonts
Programming implications:
- 'ß'.toUpperCase() in JavaScript returns 'SS' (two characters)
- This means uppercasing a string with ß changes its length, which breaks the assumption that case conversion preserves string length
- 'Straße'.toUpperCase().length is 7 (STRASSE), not 6
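The length change is easy to reproduce. The following Python sketch (Python applies the same default Unicode mapping for ß) shows the expansion and why uppercasing followed by lowercasing does not round-trip.

```python
word = "Straße"
upper = word.upper()

print(upper)                  # STRASSE
print(len(word), len(upper))  # 6 7

# The conversion is lossy: lowercasing again does not restore the ß.
print(upper.lower())          # strasse

# The capital sharp s (U+1E9E) lowercases to ß, but ß does not uppercase
# to it under the default mapping.
print("\u1e9e".lower() == "ß")  # True
print("ß".upper() == "\u1e9e")  # False
```

Any code that allocates buffers, truncates, or aligns columns based on the pre-conversion length should measure the string again after converting.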
Case-insensitive matching challenges in Unicode
The correct approach for locale-aware comparison:
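// Wrong: naive case comparison ignores Turkish rules
'İSTANBUL' === 'istanbul'.toUpperCase() // false: toUpperCase() gives 'ISTANBUL'
// Right: use localeCompare with the sensitivity option
'İSTANBUL'.localeCompare('istanbul', 'tr', { sensitivity: 'base' }) === 0 // true
// Or reuse an Intl.Collator for repeated comparisons
const collator = new Intl.Collator('tr', { sensitivity: 'accent' });
collator.compare('İSTANBUL', 'istanbul') === 0 // true
// Under Turkish rules 'ISTANBUL' pairs with 'ıstanbul', not 'istanbul'
collator.compare('ISTANBUL', 'ıstanbul') === 0 // true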
Python's case-folding:
Python 3 provides str.casefold() which is more aggressive than str.lower() for Unicode case-insensitive comparison:
# lower() may not be sufficient for all Unicode comparisons
'ß'.lower() == 'ß' # True, no change
'ß'.casefold() == 'ss' # True, applies Unicode case folding
# For comparison:
'Straße'.casefold() == 'strasse'.casefold() # True
'Straße'.lower() == 'strasse'.lower() # False
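Case folding alone does not cover every comparison: the same accented letter can be stored precomposed or as a base letter plus a combining mark, and the default folding is not Turkish-aware. The sketch below combines casefold() with Unicode normalisation, in the spirit of the caseless-matching recipe in Python's Unicode HOWTO; caseless_equal is a hypothetical helper name.

```python
import unicodedata


def _nfd(text: str) -> str:
    """Decompose characters into base letters plus combining marks."""
    return unicodedata.normalize("NFD", text)


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings ignoring case and canonical-equivalence differences."""
    return _nfd(_nfd(a).casefold()) == _nfd(_nfd(b).casefold())


print(caseless_equal("Straße", "STRASSE"))   # True
# Precomposed é versus E followed by a combining acute accent.
print(caseless_equal("\u00e9", "E\u0301"))   # True
# Default folding maps I to i, so the Turkish pairing I/ı is not recognised.
print(caseless_equal("ISTANBUL", "ıstanbul"))  # False
```

The last line is the reason a language-neutral fold is not a substitute for a Turkish-aware comparison when the text is known to be Turkish.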
Title case across languages
Title case in English capitalises the first letter of most words. This doesn't transfer to other languages:
German: nouns are always capitalised, regardless of position (der Tisch — "the table"). Title case as applied to German is inherently different.
French: minimal capitalisation in titles — only the first word and proper nouns. L'histoire de la ville not L'Histoire De La Ville.
Spanish: only the first word and proper nouns. La historia de la ciudad not La Historia De La Ciudad.
Arabic, Hebrew (RTL scripts): no concept of uppercase/lowercase — case conversion is not applicable. Title case tools should not apply case conversion to Arabic or Hebrew text.
Chinese, Japanese, Korean: no uppercase/lowercase distinction. Case conversion is inapplicable.
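These differences can be illustrated with a short Python sketch. It assumes two hypothetical helpers: has_case, a rough check for whether case conversion can change a string at all, and sentence_case, which capitalises only the first character in the way French and Spanish titles expect. Python's built-in str.title() applies the English-style rule to every word.

```python
def has_case(text: str) -> bool:
    """Return True if upper() and lower() produce different results."""
    return text.upper() != text.lower()


def sentence_case(text: str) -> str:
    """Capitalise only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


french = "l'histoire de la ville"
print(french.title())         # L'Histoire De La Ville
print(sentence_case(french))  # L'histoire de la ville

print(has_case("Straße"))                    # True
print(has_case("\u05e9\u05dc\u05d5\u05dd"))  # False (Hebrew "shalom")
print(has_case("東京"))                       # False
```

A title case tool could use a check like has_case to skip text in caseless scripts, though proper nouns still need human review in any language.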
Programming language behaviour comparison
| Language / call | Result for 'istanbul' | Notes |
|---|---|---|
| JavaScript 'istanbul'.toUpperCase() | ISTANBUL | Locale-independent; never applies Turkish rules |
| JavaScript toLocaleUpperCase('tr') | İSTANBUL | Explicitly locale-aware |
| Python 'istanbul'.upper() | ISTANBUL | Python's upper() ignores locale |
| Java "istanbul".toUpperCase(Locale.forLanguageTag("tr")) | İSTANBUL | Explicit locale; the no-argument toUpperCase() uses the JVM default locale |
| SQL UPPER() with Turkish collation | İSTANBUL | Collation-dependent |
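The Python row can be checked directly, and the reverse direction has its own surprise. In this short sketch, lowercasing İ with Python's locale-independent lower() produces a plain i followed by a combining dot above (U+0307), so the result is longer than the input and does not equal the ASCII spelling.

```python
print("istanbul".upper())  # ISTANBUL

text = "İSTANBUL"
lowered = text.lower()

print(len(text), len(lowered))  # 8 9
# The first two code points of the result: i plus a combining dot above.
print([f"U+{ord(ch):04X}" for ch in lowered[:2]])  # ['U+0069', 'U+0307']
print(lowered == "istanbul")  # False
```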
How to use the Case Converter on sadiqbd.com
- Enter any text
- Select the target case — UPPERCASE, lowercase, Title Case, Sentence case, camelCase, snake_case, etc.
- Apply — note for text in Turkish, German, or other languages with case conversion edge cases, review the output for correctness
- Use for programming conventions — convert variable names, slugs, and labels between naming conventions
Frequently Asked Questions
Should I use toLowerCase() or toLocaleLowerCase() in JavaScript?
Use toLocaleLowerCase() with an explicit locale when processing user-generated text from known locales. For ASCII-only text (URLs, programming identifiers), toLowerCase() is sufficient. For database lookups or comparisons involving multilingual content, explicit locale-aware comparison is safer.
Does this matter for URL slugs?
URL slugs are typically ASCII-transliterated before case conversion, which avoids these issues. A Turkish Ş becomes "s" in the slug, not its Unicode variant. The case conversion issue matters most for text comparisons in application logic, database operations, and search.
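The transliterate-then-lowercase order for slugs can be sketched as follows. This is an illustration under stated assumptions, not a production slug generator: slugify is a hypothetical function name, and _EXTRA is a small, deliberately incomplete table for characters that Unicode decomposition alone would drop (dotless ı and ß have no ASCII base letter).

```python
import re
import unicodedata

# Hypothetical, incomplete table for letters with no ASCII decomposition.
_EXTRA = str.maketrans({"ı": "i", "İ": "I", "ß": "ss", "\u1e9e": "SS"})


def slugify(text: str) -> str:
    """Transliterate to ASCII first, then lowercase and hyphenate."""
    text = text.translate(_EXTRA)
    # NFKD splits letters such as Ş into S plus a combining cedilla.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Only ASCII remains, so lower() has no locale-sensitive cases left.
    lowered = ascii_only.lower()
    return re.sub(r"[^a-z0-9]+", "-", lowered).strip("-")


print(slugify("Şişli Straße"))    # sisli-strasse
print(slugify("İstanbul Işık"))   # istanbul-isik
```

Because the case conversion runs only on ASCII, the result is the same regardless of which locale rules the surrounding system uses.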
Is the Case Converter free?
Yes. It is completely free, with no sign-up required.
Case conversion looks trivial until you encounter international text. The Turkish I problem has caused bugs in major software, and the German eszett length-changing property breaks assumptions that seem universal. Locale-aware APIs exist in every major language — using them for user-facing text is the correct approach.
Try the Case Converter free at sadiqbd.com — transform any text between uppercase, lowercase, title case, camelCase, snake_case, and more instantly.