Unicode Case Conversion: The Turkish I Problem, German ß, and Locale-Aware APIs

JavaScript's toUpperCase() returns the wrong character for Turkish 'i', and uppercasing the German word 'Straße' lengthens the string from 6 to 7 characters. Here are the Unicode case conversion edge cases that cause real bugs, the locale-aware API alternatives in JavaScript and Python, and the reasons title case rules differ across languages.

`toUpperCase()` is wrong for Turkish — and this reveals why text case conversion is more complex than it looks

Most developers assume case conversion is a simple problem: lowercase a becomes uppercase A, and every programming language handles this correctly. This assumption is wrong for several languages and causes real bugs in production software that handles international text.

The canonical example is the Turkish dotted/dotless I problem. Turkish has four i-related characters: dotted lowercase i, dotless lowercase ı, dotted uppercase İ, and dotless uppercase I. In Turkish, lowercase i uppercases to İ (dotted capital), not I (dotless capital). JavaScript's 'i'.toUpperCase() ignores the locale and always returns 'I', which is wrong for Turkish text, while 'i'.toLocaleUpperCase('tr') returns 'İ'. APIs that follow the system's default locale, such as Java's no-argument String.toUpperCase(), return 'İ' on a machine configured for Turkish, so code that expected 'I' suddenly gets a different character.

The Turkish I problem in practice

The characters:

i (U+0069) — Latin small letter I (with dot above — implicit in Latin script)
ı (U+0131) — Latin small letter dotless I
I (U+0049) — Latin capital letter I (dotless)
İ (U+0130) — Latin capital letter I with dot above

In Turkish:

i → İ (uppercase of dotted i is dotted I)
ı → I (uppercase of dotless i is dotless I)

In most other Latin languages:

i → I (standard uppercase)

The bug this causes:

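// With Turkish rules applied explicitly
'i'.toLocaleUpperCase('tr') === 'İ'  // true: the result is not 'I'
'i'.toUpperCase() === 'I'            // true: toUpperCase() ignores the locale

// Comparison that breaks
'FILE'.toLowerCase() === 'file'            // true: locale-independent
'FILE'.toLocaleLowerCase('tr') === 'file'  // false: the result is 'fıle'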

Real-world examples of this bug:

Email address case-insensitive comparison: two addresses that differ only in letter case fail to match under Turkish case rules when they contain 'i' or 'I'
URL normalisation: a URL containing 'i' normalised to uppercase in Turkish produces a different string
Database case-insensitive queries: SQL UPPER() function behaves differently on Turkish databases with Turkish collation
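
The failure mode behind these examples can be reproduced in a few lines. The sketch below assumes a hypothetical keyword check, isTitleKeyword, that uppercases with whatever locale the user has configured; the function names and the keyword are illustrative and not taken from any real codebase.

```javascript
// Hypothetical check: uppercases with the user's locale before comparing.
function isTitleKeyword(word, userLocale) {
  return word.toLocaleUpperCase(userLocale) === 'TITLE';
}

// Safer variant for protocol-level text: toUpperCase() ignores the locale.
function isTitleKeywordFixed(word) {
  return word.toUpperCase() === 'TITLE';
}

console.log(isTitleKeyword('title', 'en-US')); // true
console.log(isTitleKeyword('title', 'tr-TR')); // false: the result is 'TİTLE'
console.log(isTitleKeywordFixed('title'));     // true regardless of locale
```

The same input passes or fails depending only on the locale argument, which is why this class of bug tends to appear on users' machines rather than in development.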

German eszett (ß) and uppercase

German has a character — the eszett or sharp S (ß) — that has no uppercase equivalent in traditional German orthography. It was historically uppercased as "SS":

Straße (street) → STRASSE

A capital ß (ẞ, U+1E9E) was added to Unicode in 2008, and in 2017 the Council for German Orthography (Rat für deutsche Rechtschreibung) accepted it as an official uppercase form alongside SS. However:

Not all systems support it
Traditional style guides still use SS in uppercase
The capital ẞ is not consistently available in fonts

Programming implications:

'ß'.toUpperCase() in JavaScript returns 'SS' (two characters)
This means uppercasing a string with ß changes its length, which breaks code that assumes the input and output strings have the same length
'Straße'.toUpperCase().length is 7 (STRASSE), not 6
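
A quick way to see the length change, and to find which characters cause it, is sketched below. expandsOnUppercase is a hypothetical helper name; it simply compares each character's length before and after toUpperCase().

```javascript
const word = 'Straße';
const upper = word.toUpperCase();

console.log(upper);                     // 'STRASSE'
console.log(word.length, upper.length); // 6 7

// Hypothetical helper: list the characters whose uppercase form is longer.
function expandsOnUppercase(text) {
  return [...text].filter((ch) => ch.toUpperCase().length > ch.length);
}

console.log(expandsOnUppercase(word));  // [ 'ß' ]
```

Code that truncates to a fixed width or maps character offsets between the original and the uppercased string should measure the converted string rather than reuse the original length.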

Case-insensitive matching challenges in Unicode

The correct approach for locale-aware comparison:

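// Fragile: the result depends on which case rules are applied
'ISTANBUL' === 'istanbul'.toUpperCase()            // true: toUpperCase() ignores the locale
'ISTANBUL' === 'istanbul'.toLocaleUpperCase('tr')  // false: the result is 'İSTANBUL'

// Better: use localeCompare with the sensitivity option
// (in Turkish, the capital of dotted i is İ, so compare against 'İSTANBUL')
'İSTANBUL'.localeCompare('istanbul', 'tr', { sensitivity: 'base' }) === 0  // true

// Or reuse an Intl.Collator; 'accent' ignores case but not diacritics
const collator = new Intl.Collator('tr', { sensitivity: 'accent' });
collator.compare('İSTANBUL', 'istanbul') === 0  // true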

Python's case-folding: Python 3 provides str.casefold() which is more aggressive than str.lower() for Unicode case-insensitive comparison:

# lower() may not be sufficient for all Unicode comparisons
'ß'.lower() == 'ß'         # True, no change
'ß'.casefold() == 'ss'     # True, applies Unicode case folding

# For comparison:
'Straße'.casefold() == 'strasse'.casefold()  # True
'Straße'.lower() == 'strasse'.lower()        # False
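
JavaScript has no built-in equivalent of casefold(). One rough approximation, sketched here on the assumption that full Unicode case folding is not required, is to uppercase and then lowercase. foldCase is a hypothetical name, and its result can differ from true case folding for some characters.

```javascript
// Approximate case folding: uppercasing expands 'ß' to 'SS',
// and lowercasing then yields 'ss'. This is not a full Unicode case fold.
function foldCase(text) {
  return text.toUpperCase().toLowerCase();
}

console.log(foldCase('Straße'));                         // 'strasse'
console.log(foldCase('Straße') === foldCase('STRASSE')); // true
console.log('Straße'.toLowerCase() === 'strasse');       // false: 'ß' is kept
```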

Title case across languages

Title case in English capitalises the first letter of most words. This doesn't transfer to other languages:

German: nouns are always capitalised, regardless of position (der Tisch — "the table"). Title case as applied to German is inherently different.

French: minimal capitalisation in titles — only the first word and proper nouns. L'histoire de la ville not L'Histoire De La Ville.

Spanish: only the first word and proper nouns. La historia de la ciudad not La Historia De La Ciudad.

Arabic, Hebrew (RTL scripts): no concept of uppercase/lowercase — case conversion is not applicable. Title case tools should not apply case conversion to Arabic or Hebrew text.

Chinese, Japanese, Korean: no uppercase/lowercase distinction. Case conversion is inapplicable.
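
As a minimal sketch of how a tool might respect these differences, the function below applies sentence-style capitalisation for French and Spanish and word-by-word capitalisation otherwise. The names titleCase, capitalise, and SENTENCE_STYLE are hypothetical, the language list is deliberately incomplete, and real English title case also leaves some short words in lowercase.

```javascript
// Assumption: languages whose titles capitalise only the first word.
const SENTENCE_STYLE = new Set(['fr', 'es']);

// Uppercase the first code point of a word using the given locale's rules.
function capitalise(word, locale) {
  const [first = '', ...rest] = word; // iterates by code point, not UTF-16 unit
  return first.toLocaleUpperCase(locale) + rest.join('');
}

function titleCase(text, locale = 'en') {
  const language = locale.split('-')[0];
  if (SENTENCE_STYLE.has(language)) {
    return capitalise(text, locale);
  }
  return text.split(' ').map((w) => capitalise(w, locale)).join(' ');
}

console.log(titleCase('the turkish i problem', 'en'));   // 'The Turkish I Problem'
console.log(titleCase('la historia de la ciudad', 'es')); // 'La historia de la ciudad'
console.log(titleCase('istanbul', 'tr'));                 // 'İstanbul'
```

Text in scripts without case, such as Arabic or Chinese, passes through unchanged because toLocaleUpperCase() has nothing to convert.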

Programming language behaviour comparison

Language / API	Uppercase of `istanbul`	Notes
JavaScript `'istanbul'.toUpperCase()`	`ISTANBUL`	Locale-independent by specification; ignores the system locale
JavaScript `'istanbul'.toLocaleUpperCase('tr')`	`İSTANBUL`	Explicitly locale-aware
Python `'istanbul'.upper()`	`ISTANBUL`	Python's `upper()` ignores the locale
Java `"istanbul".toUpperCase(Locale.forLanguageTag("tr"))`	`İSTANBUL`	The no-argument `toUpperCase()` uses the JVM's default locale, so pass the locale explicitly
SQL UPPER() with Turkish collation	`İSTANBUL`	Collation-dependent

How to use the Case Converter on sadiqbd.com

Enter any text
Select the target case — UPPERCASE, lowercase, Title Case, Sentence case, camelCase, snake_case, etc.
Apply the conversion. For text in Turkish, German, or other languages with case conversion edge cases, review the output for correctness
Use for programming conventions — convert variable names, slugs, and labels between naming conventions

Frequently Asked Questions

Should I use toLowerCase() or toLocaleLowerCase() in JavaScript? Use toLocaleLowerCase() with an explicit locale when processing user-generated text from known locales. For ASCII-only text (URLs, programming identifiers), toLowerCase() is sufficient. For database lookups or comparisons involving multilingual content, explicit locale-aware comparison is safer.
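
A minimal sketch of that split, using two hypothetical helpers: one keeps identifier handling locale-independent, the other requires the caller to state the language of user-facing text.

```javascript
// Identifiers, URLs, protocol keywords: locale-independent lowercasing.
function normaliseIdentifier(id) {
  return id.toLowerCase();
}

// User-facing text: the caller passes the language of the text explicitly.
function lowerForDisplay(text, locale) {
  return text.toLocaleLowerCase(locale);
}

console.log(normaliseIdentifier('TITLE'));        // 'title'
console.log(lowerForDisplay('IŞIK', 'tr'));       // 'ışık'
console.log(lowerForDisplay('DİYARBAKIR', 'tr')); // 'diyarbakır'
console.log('İ'.toLowerCase().length);            // 2: 'i' plus a combining dot above
```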

Does this matter for URL slugs? URL slugs are typically ASCII-transliterated before case conversion, which avoids these issues. A Turkish Ş becomes "s" in the slug, not its Unicode variant. The case conversion issue matters most for text comparisons in application logic, database operations, and search.

Is the Case Converter free? Yes — completely free, no sign-up required.

Case conversion looks trivial until you encounter international text. The Turkish I problem has caused bugs in major software, and the German eszett's length-changing uppercase breaks assumptions that seem universal. Locale-aware APIs are available in most major languages, either built in or through internationalisation libraries such as ICU, and using them for user-facing text is the correct approach.

Try the Case Converter free at sadiqbd.com — transform any text between uppercase, lowercase, title case, camelCase, snake_case, and more instantly.
