How to handle multibyte character sets (such as UTF-8) in mb_convert_case

gitbox 2025-05-29

Why use mb_convert_case instead of strtolower or strtoupper?

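The standard strtolower and strtoupper functions work on single bytes and only convert ASCII letters reliably (since PHP 8.2 they are explicitly ASCII-only; earlier versions depended on the current locale). Applied to a multibyte encoding such as UTF-8, they cannot convert non-English letters correctly. For example: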

$text = "Γειά Σου Κόσμε"; // Greek
echo strtoupper($text); // The output may be incorrect

This is where mb_convert_case is needed:

$text = "Γειά Σου Κόσμε";
echo mb_convert_case($text, MB_CASE_UPPER, "UTF-8"); // Converts correctly to uppercase
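The difference is easy to check side by side. The following is a minimal sketch, assuming the source file is saved as UTF-8 and the mbstring extension is loaded; the sample word is chosen only for illustration.

```php
<?php
// Assumes this file is saved as UTF-8 and ext-mbstring is enabled.
$word = "résumé";

// Byte-oriented: only the ASCII letters are changed
// (on PHP 8.2+, or on earlier versions under the default "C" locale).
$ascii = strtoupper($word);

// Multibyte-aware: every letter is converted.
$multibyte = mb_convert_case($word, MB_CASE_UPPER, 'UTF-8');

echo $ascii, PHP_EOL;     // the accented letters stay lowercase
echo $multibyte, PHP_EOL; // RÉSUMÉ
```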

Detailed explanation of parameters

The function prototype of mb_convert_case is as follows:

string mb_convert_case(string $string, int $mode, ?string $encoding = null)

  • $string : The string to be converted.

  • $mode : The conversion mode, mainly:

    • MB_CASE_UPPER : Convert to uppercase

    • MB_CASE_LOWER : Convert to lowercase

    • MB_CASE_TITLE : Capitalize the first letter of each word and lowercase the rest

  • $encoding : The character encoding (such as UTF-8 or GBK). If omitted or null, the value of mb_internal_encoding() is used.
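The three modes listed above can be compared on one input. This is a small sketch under the assumption of a UTF-8 source file with mbstring enabled; the sample string is illustrative.

```php
<?php
// Assumes UTF-8 source and ext-mbstring; the sample string is illustrative.
$sample = "élan VITAL";

$upper = mb_convert_case($sample, MB_CASE_UPPER, 'UTF-8'); // ÉLAN VITAL
$lower = mb_convert_case($sample, MB_CASE_LOWER, 'UTF-8'); // élan vital
$title = mb_convert_case($sample, MB_CASE_TITLE, 'UTF-8'); // Élan Vital

printf("%s | %s | %s\n", $upper, $lower, $title);

// The encoding that would be used if the third argument were omitted.
echo mb_internal_encoding(), PHP_EOL;
```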


Practical tips

1. Specify the encoding explicitly

Developers often omit the encoding parameter, which produces wrong results whenever the internal encoding differs from the encoding of the data. It is best to always pass the encoding explicitly:

$text = "naïve façade résumé";
echo mb_convert_case($text, MB_CASE_UPPER, 'UTF-8');

Output: NAÏVE FAÇADE RÉSUMÉ
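To see why the encoding argument matters, the sketch below deliberately passes the wrong encoding for UTF-8 data. It assumes a UTF-8 source file and mbstring; ISO-8859-1 is used here only as an example of a mismatched single-byte encoding.

```php
<?php
// Assumes UTF-8 source and ext-mbstring.
$text = "RÉSUMÉ";

// Correct: the bytes are decoded as UTF-8 before the case conversion.
$right = mb_convert_case($text, MB_CASE_LOWER, 'UTF-8');

// Wrong on purpose: each byte is treated as one ISO-8859-1 character,
// so the two bytes of "É" are converted independently.
$wrong = mb_convert_case($text, MB_CASE_LOWER, 'ISO-8859-1');

var_dump($right === "résumé");                // bool(true)
var_dump(mb_check_encoding($wrong, 'UTF-8')); // bool(false)
```

With the mismatched encoding, the result is no longer valid UTF-8, which typically shows up as replacement characters or broken text in the browser.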

2. What about Chinese text?

mb_convert_case leaves Chinese characters unchanged, because Chinese has no upper and lower case. You still need to specify the correct encoding (UTF-8 here); otherwise the multibyte sequences may be misinterpreted and the string corrupted.

$text = "你好 world";
echo mb_convert_case($text, MB_CASE_UPPER, 'UTF-8'); // Output: 你好 WORLD

3. Capitalizing the first letter (title case)

This is useful for article titles, news headlines, and similar text:

 $title = "le petit prince";
echo mb_convert_case($title, MB_CASE_TITLE, 'UTF-8'); // Le Petit Prince

If the text contains HTML tags or entities, strip or decode them first; otherwise the letters inside tag names and entity names are case-converted too and the output is affected.
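One possible way to do that cleanup is sketched below. The helper name titleCasePlainText is hypothetical (it is not part of mbstring), and the sketch assumes UTF-8 input and that plain text without markup is the desired result.

```php
<?php
// Hypothetical helper, not part of mbstring: strips tags and decodes
// entities before title-casing. Assumes UTF-8 input.
function titleCasePlainText(string $html): string
{
    $plain = strip_tags($html);
    $plain = html_entity_decode($plain, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return mb_convert_case(trim($plain), MB_CASE_TITLE, 'UTF-8');
}

echo titleCasePlainText('<em>le petit prince</em> &amp; friends'), PHP_EOL;
// Le Petit Prince & Friends
```

Because the entities are decoded, the result should be passed through htmlspecialchars() again before it is written back into an HTML page.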

4. Combining with form data cleaning

$name = trim($_POST['name'] ?? '');
$cleaned = mb_convert_case($name, MB_CASE_TITLE, 'UTF-8');

This gives user input a consistent format when it is displayed in the interface, especially for document names, title bars, and similar fields.
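Form input is untrusted, so it may not even be valid UTF-8. The following sketch adds an encoding check and whitespace normalization before the conversion; cleanDisplayName is a hypothetical helper name, and returning null for malformed input is just one possible policy.

```php
<?php
// Hypothetical helper: returns null when the input is not valid UTF-8.
function cleanDisplayName(string $raw): ?string
{
    if (!mb_check_encoding($raw, 'UTF-8')) {
        return null; // reject rather than convert malformed bytes
    }
    // Collapse runs of whitespace; the /u flag makes the pattern UTF-8 aware.
    $collapsed = preg_replace('/\s+/u', ' ', trim($raw));
    return mb_convert_case($collapsed, MB_CASE_TITLE, 'UTF-8');
}

var_dump(cleanDisplayName("  jOSÉ   garcía ")); // string "José García"
var_dump(cleanDisplayName("\xC3\x28"));         // NULL (invalid byte sequence)
```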


Sample application: standardizing the display format of user profiles

Suppose you need to format and display the name field in a user profile while supporting multilingual input:

function formatUserName(string $name): string {
    return mb_convert_case(trim($name), MB_CASE_TITLE, 'UTF-8');
}

echo formatUserName("éMILIE du chatelet"); // Émilie Du Chatelet

You can also apply the same conversion in an API endpoint so that the returned JSON data uses a uniform format. For example:

 header('Content-Type: application/json');

$data = [
    'name' => mb_convert_case('sErGiO péRez', MB_CASE_TITLE, 'UTF-8'),
    'profile_url' => 'https://gitbox.net/user/sergioperez'
];

echo json_encode($data);
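Note that json_encode escapes non-ASCII characters and forward slashes by default, so the accented name appears as a \u sequence in the raw response. The sketch below compares the default output with the optional JSON_UNESCAPED_UNICODE and JSON_UNESCAPED_SLASHES flags; both forms decode to the same data, so the choice is a matter of readability.

```php
<?php
$data = [
    'name' => mb_convert_case('sErGiO péRez', MB_CASE_TITLE, 'UTF-8'),
    'profile_url' => 'https://gitbox.net/user/sergioperez',
];

// Default: non-ASCII characters and slashes are escaped.
$escaped = json_encode($data);

// Optional flags keep the UTF-8 text and the slashes readable.
$readable = json_encode($data, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);

echo $escaped, PHP_EOL;
echo $readable, PHP_EOL;
```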