Current Location: Home> Latest Articles> How to Set the Character Encoding Detection Order with PHP’s mb_detect_order() Function

How to Set the Character Encoding Detection Order with PHP’s mb_detect_order() Function

gitbox 2025-06-12

What is the mb_detect_order() Function?

In PHP, we often need to deal with character encoding issues in a multi-language environment. The mb_detect_order() function is one of the built-in functions in PHP that detects the character encoding of strings. It returns an array representing the order in which PHP will attempt to detect the encoding of a string. By setting the detection order, we can improve the accuracy of character encoding detection and better solve encoding issues.

Syntax of the mb_detect_order() Function

mixed mb_detect_order([mixed $encoding_list])

The mb_detect_order() function takes an optional parameter $encoding_list, which is an array of encoding names that specifies the order in which PHP will detect the encoding of a string. If this parameter is not provided, the function returns the current encoding detection order.

Examples of Using the mb_detect_order() Function

Example 1: Getting the Current Encoding Detection Order

$encoding_list = mb_detect_order();
print_r($encoding_list);

The code above will output the current encoding detection order array used by PHP. For example, the output may look like this:

Array
(
    [0] => ASCII
    [1] => UTF-8
    [2] => GB2312
    [3] => GBK
    [4] => BIG5
    [5] => JIS
)

From the output, we can see that mb_detect_order() tries to detect the encoding in the order of ASCII, UTF-8, GB2312, GBK, BIG5, and JIS.

Example 2: Setting the Encoding Detection Order

If you want to set a custom encoding detection order, you can use the following code:

$encoding_list = array(
    "UTF-8",
    "GBK",
    "GB2312",
    "BIG5"
);
mb_detect_order($encoding_list);

The above code sets the encoding detection order to UTF-8, GBK, GB2312, and BIG5. PHP will first try to detect the string using UTF-8 encoding, and then proceed with the other encodings.

From these two examples, we can see that the mb_detect_order() function allows you to either set the encoding detection order by passing an encoding array, or retrieve the current detection order by omitting the parameter.

Why Is It Important to Set the Encoding Detection Order?

In multi-language or international development, character encoding issues are common. If users input garbled characters on your website, you need to accurately detect their encoding type to properly parse and display the content. This is where PHP’s character encoding detection functions, such as mb_detect_encoding(), become essential.

mb_detect_encoding() depends on the encoding detection order set by mb_detect_order(). If no custom order is set, mb_detect_encoding() uses the default order. However, the default order may not be suitable for detecting certain encodings, especially when dealing with non-standard encodings. In such cases, setting the encoding detection order improves detection accuracy.

How to Set the Encoding Detection Order?

Before setting the detection order, it's important to understand some basic character encoding concepts. Different encoding methods represent characters in different binary formats. Common character encodings include:

  • GBK: Suitable for Chinese operating systems, supports Simplified and Traditional Chinese.
  • GB2312: Used in China, supports Simplified Chinese only.
  • UTF-8: A multi-byte encoding that can represent all characters in the world.
  • BIG5: Used in Taiwan, supports Traditional Chinese.
  • JIS: Used in Japan, supports Japanese.
  • ASCII: A 7-bit encoding that only supports English characters.

For multilingual support in your projects, you can customize the encoding order. For example, to support both Chinese and English, you might set the detection order to UTF-8, GBK, GB2312, and ASCII.

In practice, you can use mb_detect_order() to set the encoding detection order and apply it along with mb_detect_encoding() to detect and handle the correct encoding in your development projects.

Conclusion

In this article, we’ve covered the basics of the mb_detect_order() function in PHP and how setting the encoding detection order can improve the accuracy of character encoding detection. By customizing the detection order, we can handle encoding issues more effectively in multi-language development and enhance both the user experience and the robustness of our code.